Thursday, 15 December 2016

grep highlighting

I frequently use grep to demonstrate and explain regular expressions (regex). I use it in interactive mode with the input coming from the keyboard and the output going to the screen. So, I type some string and if grep finds a match this input string is echoed to the screen. If no match is found then this input string is not echoed to the screen. I have used this teaching method for many years.

Recently, whilst using CentOS, I discovered that grep can highlight matched strings. The CentOS machine I used was set up with grep highlighting, which is how I discovered it. I was impressed as it makes it clear exactly which text is matched.

My Mac OSX does not have grep highlighting with the default settings. I therefore decided to configure my OSX system so it does highlight grep matches as it is so useful. Rather than having to repeatedly type the relevant grep options on the command line, I put them into my .bash_profile, as follows.

export GREP_OPTIONS='--color=auto'
export GREP_COLOR='1;34' # 1=bold; 34=blue

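I now give a grep terminal session extract which illustrates non matching and matching. The first line I type contains no match and so is not echoed. The second line I type contains 노팅엄 and so grep echoes it, with the match highlighted in the real terminal.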

苹果电脑 ~: egrep '노팅엄'
안산 안양 부산 구미 제주 포항 양산
안산 안양 부산 노팅엄 구미 제주 포항 양산
안산 안양 부산 노팅엄 구미 제주 포항 양산

The text used in this terminal session is Korean Hangeul. Each word is a Korean city, apart from 노팅엄 which is Nottingham, a city in England. The Korean cities are: 안산 Ansan, 안양 Anyang, 부산 Busan, 구미 Gumi, 제주 Jeju, 포항 Pohang and 양산 Yangsan.

Note: I use egrep as it is short form for grep -E which enables extended regular expressions.
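For readers who want to see what the highlighting amounts to, here is a minimal Python sketch of the same idea. It is only an approximation of what grep does, not grep's actual implementation: the function name highlight is my own, and the escape sequences are the standard ANSI codes that correspond to the 1;34 setting above.

```python
import re

BOLD_BLUE = "\x1b[1;34m"  # SGR codes 1 (bold) and 34 (blue), as in GREP_COLOR='1;34'
RESET = "\x1b[0m"         # switch all attributes off again


def highlight(pattern, line):
    """Return `line` with every match coloured, or None when nothing matches."""
    regex = re.compile(pattern)
    if regex.search(line) is None:
        return None  # like grep, a non-matching line is simply not echoed
    return regex.sub(lambda m: BOLD_BLUE + m.group(0) + RESET, line)


if __name__ == "__main__":
    typed_lines = (
        "안산 안양 부산 구미 제주 포항 양산",
        "안산 안양 부산 노팅엄 구미 제주 포항 양산",
    )
    for typed in typed_lines:
        echoed = highlight("노팅엄", typed)
        if echoed is not None:
            print(echoed)
```

Run in a terminal that understands ANSI colours, only the second line is printed, with 노팅엄 in bold blue.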

Environment: OSX Sierra v10.12.1

Saturday, 26 November 2016

Domain Name Registrations

To keep up to date with Domain Name registrations I highly recommend gd-domains. It gives listings of newly registered Domain Names on a daily basis. Listings for individual TLDs (Top Level Domains) are available. It is thanks to this site that I discovered the below impressive and sizeable family of Korean Domain Names. They were registered on 11th and 22nd November 2016. The TLD used is 닷컴 which is Verisign's Korean equivalent to com.

I think embedding telephone numbers into these IDNs (Internationalized Domain Names) is clever marketing ☎️

  1. 남양주용달이사-010-3126-0853.닷컴
  2. 원룸반포장이사-010-3126-0853.닷컴
  3. 마포포장이사-010-3126-0853.닷컴
  4. 강동구이사-010-3126-0853.닷컴
  5. 강서구포장이사-010-3126-0853.닷컴
  6. 광진구원룸이사-010-3126-0853.닷컴
  7. 광진구이사짐센터-010-3126-0853.닷컴
  8. 송파구포장이사-010-3126-0853.닷컴
  9. 중랑구원룸이사-010-3126-0853.닷컴
  10. 서초구원룸이사-010-3126-0853.닷컴
  11. 송파구용달센터-010-3126-0853.닷컴
  12. 학생이사-010-3126-0853.닷컴
  13. 사당동원룸이사-010-3126-0853.닷컴
  14. 지방용달가격-010-3126-0853.닷컴
  15. 싼곳용달이사-010-3126-0853.닷컴
  16. 마포용달이사-010-3126-0853.닷컴
  17. 반포장이사-010-3126-0853.닷컴
  18. 서울일반이사-010-3126-0853.닷컴
  19. 1톤트럭이사-010-3126-0853.닷컴
  20. 1톤소형이사-010-3126-0853.닷컴
  21. 1톤용달-010-3126-0853.닷컴
  22. 서울1톤용달-010-3126-0853.닷컴
  23. 원룸포장이사-010-3126-0853.닷컴
  24. 원룸이사가격-010-3126-0853.닷컴
  1. 마포원룸이사-010-4675-2414.닷컴
  2. 강동구용달이사-010-4675-2414.닷컴
  3. 강서구용달이사-010-4675-2414.닷컴
  4. 강동구지역이사-010-4675-2414.닷컴
  5. 서울개인용달이사-010-4675-2414.닷컴
  6. 광진구용달이사-010-4675-2414.닷컴
  7. 송파구원룸이사-010-4675-2414.닷컴
  8. 동작구용달이사-010-4675-2414.닷컴
  9. 중랑구용달이사-010-4675-2414.닷컴
  10. 송파구용달이사-010-4675-2414.닷컴
  11. 서초구용달이사-010-4675-2414.닷컴
  12. 서울소형이사-010-4675-2414.닷컴
  13. 오피스텔이사-010-4675-2414.닷컴
  14. 지방용달이사-010-4675-2414.닷컴
  15. 용산용달이사-010-4675-2414.닷컴
  16. 용달차이사-010-4675-2414.닷컴
  17. 합정동용달이사-010-4675-2414.닷컴
  18. 서울반포장이사-010-4675-2414.닷컴
  19. 서울경기용달차-010-4675-2414.닷컴
  20. 친절원룸이사-010-4675-2414.닷컴
  21. 원룸투룸-010-4675-2414.닷컴
  22. 원룸이사비용-010-4675-2414.닷컴
  23. 화물차용달-010-4675-2414.닷컴
  24. 용달이사견적-010-4675-2414.닷컴

www.gd-domains.com/20161111-229 is the link for 닷컴 registrations on 11th November 2016. www.gd-domains.com/20161122-229 is the direct link for all 닷컴 registrations on 22nd November 2016, 2016년 11월 22일 화요일.
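On the wire, an IDN travels in an ASCII form in which each non-ASCII label starts with xn--. The following is a small Python sketch of that conversion using the standard library's idna codec. Please treat it as an illustration only: that codec implements the older IDNA 2003 rules, the helper names to_ascii and to_unicode are my own, and registries may apply newer rules.

```python
def to_ascii(domain):
    """Encode each label of an IDN to its ASCII-compatible ('xn--') form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(ascii_domain):
    """Decode an ASCII-compatible domain name back to Unicode."""
    return ascii_domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    idn = "남양주용달이사-010-3126-0853.닷컴"
    ascii_form = to_ascii(idn)
    print(ascii_form)              # two labels, each starting with xn--
    print(to_unicode(ascii_form))  # back to the Korean form
```

The telephone number survives unchanged inside the ASCII form because digits and hyphens are already ASCII. Only the Hangeul characters are re-encoded.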

Wednesday, 26 October 2016

Family of Korean IDNs

The following is a list of functioning Korean IDNs (Internationalized Domain Names). They all belong to the same Computer Repair Company. The TLD (Top Level Domain) used is 닷컴 which is Verisign's new Korean language equivalent to their com TLD. Each IDN contains 컴퓨터수리 which means Computer Repair. The only difference between these IDNs is the first two characters which are the names of South Korean cities. I think this is clever and creative use of IDNs!

The last two IDNs below are structured differently. The first two characters are, I think, a neighbourhood and the first two characters after the hyphen are the city.

The cities are: 시흥 Siheung, 부천 Bucheon, 창원 Changwon, 마산 Masan, 평택 Pyeongtaek, 오산 Osan, 진해 Jinhae, 김해 Gimhae, 부산 Busan.

  1. 시흥컴퓨터수리.닷컴
  2. 부천컴퓨터수리.닷컴
  3. 창원컴퓨터수리.닷컴
  4. 마산컴퓨터수리.닷컴
  5. 평택컴퓨터수리.닷컴
  6. 오산컴퓨터수리.닷컴
  7. 진해컴퓨터수리.닷컴
  8. 김해컴퓨터수리.닷컴
  9. 북동컴퓨터수리-창원컴퓨터수리.닷컴
  10. 우동컴퓨터수리-부산컴퓨터수리.닷컴
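The naming pattern described above can be checked with a Unicode regular expression. Here is a minimal Python sketch, assuming the pattern is exactly "two characters, then 컴퓨터수리, then .닷컴". The function name city_of is hypothetical.

```python
import re

# Two word characters (the city), then "Computer Repair", then the Korean TLD.
# In Python 3, \w matches Hangeul as well as ASCII letters.
SIMPLE_IDN = re.compile(r"^(?P<city>\w{2})컴퓨터수리\.닷컴$")


def city_of(idn):
    """Return the two-character city prefix, or None if the IDN is structured differently."""
    match = SIMPLE_IDN.match(idn)
    return match.group("city") if match else None


if __name__ == "__main__":
    for idn in ("시흥컴퓨터수리.닷컴", "북동컴퓨터수리-창원컴퓨터수리.닷컴"):
        print(idn, "->", city_of(idn))
```

The first IDN yields 시흥. The hyphenated IDN yields None because, as noted above, it is structured differently from the simple ones.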

Update 9th March 2017: Here is another family of Computer Repair 컴퓨터수리 IDNs with a different registrant.

  1. 김포컴퓨터수리.닷컴
  2. 안양컴퓨터수리.닷컴
  3. 용인컴퓨터수리.닷컴
  4. 용산컴퓨터수리.닷컴
  5. 대구컴퓨터수리.닷컴
  6. 종로컴퓨터수리.닷컴
  7. 강남컴퓨터수리.닷컴
  8. 파주컴퓨터수리.닷컴
  9. 일산컴퓨터수리.닷컴
  10. 성남컴퓨터수리.닷컴

Friday, 7 October 2016

Computer Science Internationalization — Bidi

Scripts such as Latin are written from Left to Right (L➡︎R). Scripts such as Arabic and Hebrew are written Right to Left (L⬅︎R). What happens when we mix L➡︎R and L⬅︎R scripts within a document? Here is an exercise in mixing scripts.

Take a mixed bidi (bidirectional) string consisting of Latin and Hebrew characters in a L➡︎R paragraph.

abcאבגdef

...and here is the same string in a L⬅︎R paragraph.

abcאבגdef

Now to the actual exercise. Copy the above strings to your text editor or word processor. You will need to set up the 2nd occurrence of the string as a L⬅︎R paragraph. I am assuming that your directionality is L➡︎R by default. Each string has two boundaries where the text changes direction. For each boundary you are going to insert a character, either a L➡︎R, such as x, or a L⬅︎R, such as ד. For each insertion operation use the initial mixed bidi string. There are two mixed strings above, each with two boundaries and two possible characters to insert, and so there are a total of 8 insertion operations. The challenge is to predict where in the strings the inserted character will appear before you actually insert the character. Give it a go! Good luck😀

If I did this exercise before I ever studied bidi, I would probably have scored 4/8. Now I understand how the computer is processing this bidi text and so I usually score full marks for such exercises. It is though not an intuitive process for me as I have spent most of my life reading and writing L➡︎R scripts only. I have to think very carefully as to how the computer does it in order to determine the correct answers.

The main purpose of this exercise is to think about the ordering of the characters in the strings. There are two orderings to consider: memory order and display order. Memory order is how it is logically saved in memory which in this case is the order in which I typed it. The memory order of the string I have used above is "abcגבאdef". Display order is how it is presented to the viewer. You have already seen, above, the two possible display orders for the single string "abcגבאdef".
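The difference between memory order and display order can be explored in code. Below is a deliberately simplified Python sketch, not the full Unicode Bidirectional Algorithm. It assumes the string contains only strong L➡︎R and strong L⬅︎R letters (no digits, spaces, punctuation or explicit formatting characters). The function names are my own, and the Hebrew letters are written as escapes (alef, bet, gimel) so that the memory order is unambiguous in the source code.

```python
import unicodedata

TEXT = "abc\u05d0\u05d1\u05d2def"  # memory order: a b c alef bet gimel d e f


def bidi_classes(text):
    """Bidi class of each character: 'L' for Latin letters, 'R' for Hebrew letters."""
    return [unicodedata.bidirectional(ch) for ch in text]


def boundaries(text):
    """Indexes (in memory order) at which the direction changes."""
    classes = bidi_classes(text)
    return [i for i in range(1, len(classes)) if classes[i] != classes[i - 1]]


def visual_order(text, paragraph_rtl=False):
    """Left-to-right display order for a string of strong L and R letters only."""
    levels = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "R":
            levels.append(1)  # right-to-left letters sit at level 1
        elif cls == "L":
            levels.append(2 if paragraph_rtl else 0)
        else:
            raise ValueError("this sketch only handles strong L and R letters")
    chars = list(text)
    # Reverse every run at or above each level, from the highest level down to 1.
    for level in range(max(levels, default=0), 0, -1):
        i = 0
        while i < len(chars):
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < len(chars) and levels[j] >= level:
                j += 1
            chars[i:j] = reversed(chars[i:j])
            i = j
    return "".join(chars)


if __name__ == "__main__":
    print(bidi_classes(TEXT))
    print(boundaries(TEXT))
    print([hex(ord(c)) for c in visual_order(TEXT)])
    print([hex(ord(c)) for c in visual_order(TEXT, paragraph_rtl=True)])
```

In a L➡︎R paragraph only the Hebrew run is reversed for display. In a L⬅︎R paragraph the three runs are also laid out from right to left, so def appears on the left and abc on the right. The results are printed as code points so that the terminal's own bidi handling does not reorder them a second time.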

I have used TextEdit for this exercise. In order to set paragraph text direction in TextEdit follow the path: "TextEdit➜ Format➜ Text➜ Writing Direction". Now set paragraph text direction to Right to Left. TextEdit correctly handles bidi text but that is not the case for all word processors or text editors.

There are several permutations of this exercise, including:

  1. What happens at the boundaries with forward delete and back delete?
  2. What happens if the initial memory order character(s) are L⬅︎R instead of L➡︎R?
  3. Use Arabic instead of Hebrew as this introduces the additional challenge of letters changing shape according to preceding and following characters.

This article is aimed at L➡︎R reading/writing people. If you are a L⬅︎R person then you will need to invert some of my instructions. Actually, if you are a L⬅︎R person you will be totally familiar with mixing bidi text and so will fully understand this exercise.

Environment: OSX v10.12 (Sierra), TextEdit v1.12

Wednesday, 21 September 2016

Computer Science Curriculum Internationalization

I have been a long time practitioner and advocate of internationalising Computer Science teaching. My fundamental aim is to give students global computing skills. One such global skill, for example, is the processing of Unicode text rather than the very restricted ASCII text. Once one encompasses Unicode then one is encompassing most languages and scripts of the world.

Over the years I have tried to find other like minded Computer Science educators but have had no success. I had more or less concluded I am a solitary voice when it comes to Computer Science internationalisation. There does though appear to be some light as I recently discovered two organisations that promote internationalisation of teaching curricula.

  • The Centre for Curriculum Internationalisation (CCI), which is based at Oxford Brookes University, UK: brookes.ac.uk/services/cci/index.html In addition to their website they have a Google discussion group. I posted some information on my Computer Science Internationalisation initiatives and practices to this Google forum. Please see groups.google.com/forum/#!topic/cicin/6XJCrqcdLD4
  • Internationalisation of the Curriculum (IoC) in action, which is based in Australia: ioc.global

Update 19th March 2017: I reached out to people and groups and I conclude I am still a solitary voice with respect to Computer Science Curricula Internationalization in UK Universities. I do believe UK Universities will have to embrace Computer Science Internationalization but I think it will be at least ten years before that happens. So, why do I persist? Am I wrong? Well, if I am wrong then so are Google, Wikipedia, Facebook, Nivea, Booking.com, Nestlé, Hotels.com, Pampers, Intel, Microsoft, Philips, Adobe, Twitter and many, many more companies. They all operate globally and are all producing software for the world. These global companies need graduates who have the skills and attitude necessary for building global software.
Note: I have taken these company names from The top 25 global websites from the 2017 Web Globalization Report Card globalbydesign.com/2017/02/16/the-top-25-global-websites-from-the-2017-web-globalization-report-card/

Update 1st October 2017:  I recently created an open forum specifically for discussion on the topic of Computer Science/ICT/IT curricula internationalisation. If this topic interests you, please become a member and join in the discussions. It is open to all. Please see groups.google.com/forum/m/#!forum/computer-science-curriculum-internationalization

Thursday, 25 August 2016

Internationalizing Regular Expressions

The purpose of this post is to encourage all of you who are teaching Regular Expressions (RegExp) or are learning RegExp to think international. Think beyond ASCII. Thinking international means thinking Unicode instead of ASCII. Once one thinks Unicode then one is encompassing the world.

My RegExp teaching slides use ASCII only as a starting point. They then progress to Unicode. I give one of my slides as an example.


There is a lot of information packed into this one slide which needs some explanation. My example slide is using Unicode Chinese characters and Unicode Emoji characters.
  • 人 is a Unicode Chinese character meaning person
  • 鸭 is a Unicode Chinese character meaning duck
  • 鸡 is a Unicode Chinese character meaning chicken
This slide also contains a cultural reference. Some time ago I came across a Weibo 微博 post about the visit to Hong Kong by the big floating yellow duck http://edition.cnn.com/2013/05/02/travel/hong-kong-giant-duck/ The Weibo post had a photo containing many people looking at the duck. The text of the Weibo post was:- 

人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人鸭人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人人

 When I saw this I thought it so funny and very clever. It just would not work in English but works so perfectly in Chinese. When writing my RegExp slides I remembered this Weibo post and thought this would make for an excellent cultural connection. Thus my slide is internationalized by using Unicode and incorporating a cultural reference. The use of Unicode is essential for internationalisation. Incorporating a cultural reference is optional but it does add an extra dimension that may well serve to make RegExp slides more interesting and encourage readers to explore the boundless potential of internationalized Regular Expressions.
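To try this kind of example for yourself, here is a minimal Python sketch. It is not the content of my slide. It simply shows that a regular expression can treat 人 and 鸭 like any other character. The function name find_duck and the shortened sample text are made up for illustration.

```python
import re

# One or more people, exactly one duck, one or more people.
CROWD_WITH_DUCK = re.compile(r"(人+)鸭(人+)")


def find_duck(text):
    """Return (index of the duck, people before it, people after it), or None."""
    match = CROWD_WITH_DUCK.search(text)
    if match is None:
        return None
    return match.start() + len(match.group(1)), len(match.group(1)), len(match.group(2))


if __name__ == "__main__":
    post = "人" * 5 + "鸭" + "人" * 3  # a shortened stand-in for the Weibo text
    print(find_duck(post))
    print(re.findall(r"[鸭鸡]", "人鸡人鸭人"))  # a character class works too
```

No special flags are needed: Python 3 string patterns are Unicode by default, so the quantifier + and the character class [鸭鸡] behave exactly as they would with ASCII letters.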

 Note: I have tried to find the Weibo post but have been unsuccessful so I cannot, unfortunately, provide a reference.

Monday, 14 March 2016

Oracle Forms Debugging: Using the “DEBUG_MESSAGES=YES” runtime option - Handling "Please Acknowledge" message

“DEBUG_MESSAGES=YES” is a “quick and dirty” technique for pinpointing code problems. With this option set, a message is displayed automatically to name each trigger as it executes. Once you encounter the runtime error, you will know the last trigger that fired and should be able to trace through the code from that point. For anyone who was around in the SQL*Forms 3.0 days, this was the default behavior when running a form in debug mode.

To use this option from the Form Builder, check the “debug_messages” option on the runtime tab after choosing Tools -> preferences from the menu. From the command line, simply add the “debug_messages=yes” option. 

If you are using an icon in Windows to start your form, add the option to the shortcut command. It would look something like this for Windows platforms:

c:\orant\bin\ifrun60.exe module=myform userid=scott/tiger debug_messages=yes

On Unix:
f60runm module=myform userid=scott/tiger debug_messages=yes

The major disadvantage of this technique is that you have to acknowledge every message as each trigger executes, which can be very annoying. The "Please acknowledge message" prompt appears every time a trigger fires. Although it shows the triggers that fire, it does not show program units that execute, and there is no way to display the values of variables. It can also cause the same focus problems and program flow interruption that you see with the “MESSAGE” built-in. However, this method is useful if you have no idea which trigger is causing the problem; once you have narrowed down the scope, other troubleshooting techniques would be more appropriate.

Sunday, 13 March 2016

Arabic Email Addresses

Most human language scripts are written from Left to Right (L➡︎R). Arabic is written Right to Left (L⬅︎R). An email address written in the Latin script would be displayed L➡︎R — username@domain-name. An Arabic email address, on the other hand, would normally and without intervention be displayed L⬅︎R as domain-name@username.

Letʼs take a fictitious Arabic email address — خالد@الدوحة.قطر
  • خالد is the username Khalid
  • الدوحة is the 2nd level domain name Doha
  • قطر is the Top Level Domain (TLD) Qatar. This part is not fictitious as قطر is a valid ccTLD.
Your browser should be displaying the email address خالد@الدوحة.قطر in L⬅︎R order which is not an order familiar to most L➡︎R readers and so requires some effort to parse.

When text has mixed L➡︎R and L⬅︎R characters it is referred to as Bidirectional (bidi) text. There is a complex Unicode algorithm, the Unicode Bidirectional Algorithm, specifically to determine the display order of bidi text: unicode.org/reports/tr9/ If you read this report you will see something called Directional Isolates.

In the html world there are tags and attributes specific to bidi. One such tag is <bdi> which is bidi isolate. Using such html bidi isolation one can incorporate Arabic email addresses that are natural to read for both L➡︎R and L⬅︎R readers. These addresses can be written such that their overall text direction adheres to the text direction of the context. This context may be the direction of the whole html document or some subpart such as a paragraph.

First we will set up our html with a L➡︎R context for L➡︎R readers. The paragraph (p) below is set up with dir (direction) set to ltr (left to right). The email address has 3 components: username, 2nd level domain name and TLD. Each component is direction isolated. This gives an email address whose overall direction is L➡︎R. The text of each component is, as it should be, L⬅︎R. I posit that this is much easier for a L➡︎R reader to comprehend. It is now obvious, for instance, which is the username and which is the TLD.

The html code
<p dir="ltr"><bdi>خالد</bdi>@<bdi>الدوحة</bdi>.<bdi>قطر</bdi></p>
displays the address as
خالد@الدوحة.قطر

 
But how will the address be displayed if the context is changed to rtl (right to left)? The same code now displays the whole address in L⬅︎R order, both the overall direction and the text direction of each component. Thus we have also catered for L⬅︎R readers without changing the html code that displays the address.

The html code
<p dir="rtl"><bdi>خالد</bdi>@<bdi>الدوحة</bdi>.<bdi>قطر</bdi></p>
displays the address as
خالد@الدوحة.قطر
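The markup above can also be generated rather than written by hand. Here is a minimal Python sketch that splits an address into its components and wraps each one in a bdi element. The function names isolate and email_to_html are hypothetical, and the sketch assumes a simple address of the form username@label.label with no quoting or comments.

```python
from html import escape


def isolate(text):
    """Wrap one component in a bidi isolate so its direction cannot leak out."""
    return "<bdi>" + escape(text) + "</bdi>"


def email_to_html(address, direction="ltr"):
    """Build a paragraph whose overall direction follows `direction`."""
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")
    username, at_sign, domain = address.rpartition("@")
    if not at_sign or not username or not domain:
        raise ValueError("expected an address of the form username@domain")
    labels = ".".join(isolate(label) for label in domain.split("."))
    return '<p dir="{}">{}@{}</p>'.format(direction, isolate(username), labels)


if __name__ == "__main__":
    address = "خالد@الدوحة.قطر"
    print(email_to_html(address, "ltr"))
    print(email_to_html(address, "rtl"))
```

Only the dir attribute differs between the two outputs. The isolated components are identical, which is the point made above about not having to change the address markup for different readers.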

Just in case your browser cannot, as yet, handle bidi isolates here are the 2 contexts in image format.


Monday, 29 February 2016

Oracle Forms Exception Handling: NO_DATA_FOUND, TOO_MANY_ROWS and OTHERS

The EXCEPTION block in PL/SQL is used in Oracle Forms to handle exceptions. The following PL/SQL code snippet uses the NO_DATA_FOUND, TOO_MANY_ROWS and OTHERS exceptions. If the SQL SELECT query does not return any data, the NO_DATA_FOUND exception is raised. If the SQL SELECT query returns more than one row where it was expected to return only one row, the TOO_MANY_ROWS exception is raised. If you are not sure what kind of exception the code can throw, use the OTHERS handler, which catches any exception not handled above it.

DECLARE
  DEPARTMENT_NAME VARCHAR2(60); -- VARCHAR2 is the usual PL/SQL string type
BEGIN
  SELECT DEPTNAME INTO DEPARTMENT_NAME FROM DEPT WHERE DEPTNO = 20;
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    MESSAGE('No data found');
  WHEN TOO_MANY_ROWS THEN
    MESSAGE('More than one row found');
  WHEN OTHERS THEN
    NULL; -- don't do anything and just return from the procedure
END;
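If you do not have Oracle Forms to hand, the single-row rule behind these exceptions can be tried out with the following Python sketch using the built-in sqlite3 module. This is an analogy only, not Oracle code: the exception classes, the function names and the sample DEPT rows are all hypothetical, and SQLite itself has no SELECT INTO.

```python
import sqlite3


class NoDataFound(Exception):
    """Stand-in for PL/SQL's NO_DATA_FOUND."""


class TooManyRows(Exception):
    """Stand-in for PL/SQL's TOO_MANY_ROWS."""


def select_into(conn, sql, params=()):
    """Return the single value of a query that must produce exactly one row."""
    rows = conn.execute(sql, params).fetchmany(2)  # two rows are enough to decide
    if not rows:
        raise NoDataFound(sql)
    if len(rows) > 1:
        raise TooManyRows(sql)
    return rows[0][0]


def department_message(conn, deptno):
    """Mirror the PL/SQL block: return the name, or the message it would show."""
    try:
        return select_into(conn, "SELECT DEPTNAME FROM DEPT WHERE DEPTNO = ?", (deptno,))
    except NoDataFound:
        return "No data found"
    except TooManyRows:
        return "More than one row found"


def make_demo_db():
    """In-memory table with made-up rows; DEPTNO 30 is duplicated on purpose."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE DEPT (DEPTNO INTEGER, DEPTNAME TEXT)")
    conn.executemany(
        "INSERT INTO DEPT VALUES (?, ?)",
        [(10, "ACCOUNTING"), (20, "RESEARCH"), (30, "SALES"), (30, "SALES EAST")],
    )
    return conn


if __name__ == "__main__":
    db = make_demo_db()
    for number in (20, 99, 30):
        print(number, department_message(db, number))
```

Unlike the WHEN OTHERS THEN NULL handler above, this sketch lets any other error propagate, which is usually easier to debug than silently ignoring it.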

Difference between WHEN-VALIDATE-ITEM and KEY-NEXT-ITEM triggers

The WHEN-VALIDATE-ITEM and KEY-NEXT-ITEM triggers are very close to each other and cause a lot of confusion. Here are three differences between them to make the picture a little clearer.

1. Whenever the user changes the value in an item and tries to move out of that item using ENTER, TAB or the mouse, the WHEN-VALIDATE-ITEM trigger fires. The KEY-NEXT-ITEM trigger, however, does not fire if the user moves out using the mouse, so any validation written in that trigger will be skipped. It is better to use the WHEN-VALIDATE-ITEM trigger in this case, as it also works with the mouse.

2. The KEY-NEXT-ITEM trigger fires before the WHEN-VALIDATE-ITEM trigger.

3. The KEY-NEXT-ITEM trigger fires every time you move to the next field from that field, but WHEN-VALIDATE-ITEM fires only when you have actually made a change to that item. If you have made no changes to the item, it will not fire when you move out of it.

Personally, I prefer to use the WHEN-VALIDATE-ITEM trigger in many situations.

Sunday, 28 February 2016

Oracle Forms Tutorials: WHEN-VALIDATE-ITEM trigger

Consider that you have an Oracle form with a data block based on the EMP table. The EMP table has a column called SALARY, and there is a constraint on the SALARY column that it should be greater than or equal to $1000. Your requirement is that whenever a user enters a salary in the Oracle Forms item (say ITEM_SALARY) and tabs out of that item, a validation should fire; if the value entered is not valid, the form should show an error message and not let the cursor move to another item. In these situations the WHEN-VALIDATE-ITEM trigger is used. The following is the PL/SQL code you should write in the WHEN-VALIDATE-ITEM trigger of the ITEM_SALARY item.

IF :ITEM_SALARY < 1000 THEN
    MESSAGE('ERROR: Salary must be at least $1000.');
    RAISE FORM_TRIGGER_FAILURE; -- To keep the cursor in the item
END IF;

You should also watch this video on YouTube by Edward Honour.

In this video, he looks up the department name when the user enters the department number. If the department number does not exist in the database, he shows an error message and keeps the cursor in the item by raising the FORM_TRIGGER_FAILURE exception. He uses the NO_DATA_FOUND exception to show the error message. The following is the code used in this video:

BEGIN
  SELECT DEPT_NAME INTO :BLOCKNAME.ITEMNAME FROM DEPT WHERE DEPT_NO = :BLOCKNAME.ITEMNAME2;
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    MESSAGE('Invalid Department Number');
    RAISE FORM_TRIGGER_FAILURE; -- keep the cursor in the item
END;

5 Things you should never discuss with your manager

An employee should be fair and transparent with his or her manager, but a few revelations can spoil your relationship with your manager. Try never to talk about these things with your manager:

1. Your doubts about the manager's decisions - At times, you may want to question his/her judgement. Do yourself a favour and give it a pass. You may not know what business pressures he/she is under.

2. Snooping on the manager's life - Every employee wants to check what his or her manager is doing in social life. Don't blow your cover by openly admitting to it.

3. Office gossip - Never share office hearsay with your manager without checking its truthfulness. Sharing something random that you heard in the cafeteria and reacting to it may get him or her thinking that you are not mature or trustworthy.

4. Your personal secrets - Your life and its constant struggles are for you to handle. Pouring it all out in front of your manager will put you in a vulnerable position.

5. Your expectations of the manager - The manager is mandated to put forward his/her expectations of you. But that does not mean you should go and tell the manager how you rate his/her skill set!

Friday, 1 January 2016

Emoji by Name

Here is a method for typing Emoji by name but not by English name. This method is for writing Emoji by Chinese name. OSX provides a Pinyin Input Method for writing Chinese. Pinyin is a romanization of Chinese. When writing in pinyin a candidate window pops up which lists all possible Chinese characters 汉字 and Emoji.

Candidate Window - Frequency

Candidate Window — Emoji

Here is a small sample of the Emoji which can be typed using this Pinyin Input Method. Each line below starts with the pinyin followed by the Emoji. The pinyin can have multiple meanings, multiple candidate Chinese characters and hence multiple Emoji. Hopefully for the examples I have given below you will be able to work out the meanings from the Emoji. Some of the below pinyin represent objects and some represent emotions.

  1. ai — ❤️ 😘 💗 💓 😍
  2. bei shang — 😢 😭
  3. che — 🚗 🚘
  4. hou — 🐒 🐵
  5. hua — 🌹 🌼 💐 🌷 🌸 🌺
  6. ka fei — ☕️
  7. kai xin — 😄 😺 😃 😆 ☺️
  8. mao — 🐱 🐈 ⚓️
  9. niu — 🐂 🐃 🐄 🐮
  10. pi jiu — 🍺 🍻
  11. sheng qi — 😠 😡 💢 😾
  12. shu — 🌲 🌳 🌴 🐭
  13. shui guo — 🍉 🍊 🍇 🍈 🍌 🍍 🍎 🍑 🍒 🍓 🍅 🍆 🍋 🍏 🍐
  14. tuo la ji — 🚜
  15. xiang — 🐘
  16. xiao — 😊 😄
  17. xie — 👟 👠
  18. xue ren — ⛄️ ☃️
  19. yin yue — 🎵 🎷 🎶 🎸 🎹 🎺 🎻 🎼 🎤 🎧 📯
  20. yu — 🐟 🐠
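For comparison with English names, here is a small Python sketch. It is not how the OSX Pinyin Input Method works internally. It is just a lookup table built from four of the entries above, alongside the official English Unicode character names from the standard unicodedata module. The names CANDIDATES, emoji_for and english_names are my own.

```python
import unicodedata

# A tiny, hand-made candidate table taken from the list above.
CANDIDATES = {
    "xiang": ["🐘"],
    "pi jiu": ["🍺", "🍻"],
    "tuo la ji": ["🚜"],
    "yu": ["🐟", "🐠"],
}


def emoji_for(pinyin):
    """Return the candidate Emoji for a pinyin string, or an empty list."""
    return CANDIDATES.get(pinyin.strip().lower(), [])


def english_names(pinyin):
    """Return the Unicode (English) character name of each candidate Emoji."""
    return [unicodedata.name(emoji) for emoji in emoji_for(pinyin)]


if __name__ == "__main__":
    for pinyin in CANDIDATES:
        print(pinyin, emoji_for(pinyin), english_names(pinyin))
```

For example, xiang gives 🐘, whose Unicode name is ELEPHANT, and tuo la ji gives 🚜, whose Unicode name is TRACTOR.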

Environment: OSX El Capitan v10.11.2