Thursday, March 14, 2013

the genius of unicode in javascript identifiers

intellisense for javascript sucks !

i've used recent versions of webstorm (jetbrains javascript IDE) and netbeans, and older versions of eclipse and a few others. and i've googled and read extensively - as far as i can tell webstorm has the best support for javascript code understanding (netbeans used to be a close 2nd, but the latest nb7.3 version has been a step backwards and ws6 is a step forwards - leaving a gap)

but even in webstorm it sucks. for the most part 'find usages' just does a text search - it can't distinguish between 2 methods with the same name in unrelated prototypes, even in the simplest cases with enough jsdoc annotations to make it unambiguous. 'go to declaration' works some of the time, but still fails often in cases where plenty of information is available

i spent some time submitting bug reports, trying new IDEs, annotating extensively with jsdoc, refactoring my code to be easier to parse, etc. to little effect, and i've ultimately accepted that the only way to be able to reliably navigate in javascript code is by using unique identifiers

i tried a number of schemes

  • long java-style names, eg setDayOfWeek
  • various prefix schemes using the same prefix for all methods in a prototype, eg for the ViewDiary prototype i might have vdSet and vdMerge or vd_set and vd_merge
  • various suffix schemes similar to the prefixes
long names are ugly, harder to read, and didn't really provide much uniqueness. prefixes are ugly and make code completion a pain ... typing 'set<tab>' didn't provide any options, i needed to remember the prototype prefix to get anything useful. underscore-based suffixes worked the best - they're ugly, but don't harm code completion

taking the suffix idea to the 'logical' extreme, i switched from _XX suffixes to unicode suffixes. a single unobtrusive character (i mostly use subscripts) that a human hardly notices when reading, but allows the IDE to keep everything unique

    /**
     * state representation of a nu.FoodList
     * @constructor
     * @augments nu.Stateᵧ
     */
    nu.Stateᵧ.Listᵧ = function() {
        this.replyᵧ = null;
        this.termᵧ = null;
        this.startᵧ = 0;
        this.kfoodᵧ = 0;
        this.derivedᵧ = false;
        this.jumpEndᵧ = false;
    };
    nu.Stateᵧ.Listᵧ.prototype = {
        setᵧ: function(term,start,kfood) {
            this.termᵧ = term;
            this.startᵧ = +start || 0;
            this.kfoodᵧ = +kfood || 0;
            return this;
        }
    };

they're easy to read, compatible with completion, and easy to keep unique. both webstorm and nb7.2 handle them elegantly - find usages and go to declaration work great. the downside is that they're a pain to type without completion (i copy/paste)
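to make the uniqueness point concrete, here's a minimal sketch with two unrelated prototypes that would both have had a plain 'set' method. the prototypes and fields are made up for illustration (ViewDiary borrows the name from the prefix example above, FoodList is hypothetical) - the point is only that the suffix is part of the property name, so each 'set' is a distinct identifier that a text search can tell apart

```javascript
// hypothetical prototypes, not taken from the real code base
function ViewDiary() { this.dayᵦ = 0; }
ViewDiary.prototype.setᵦ = function (day) {
    this.dayᵦ = +day || 0;
    return this;
};

function FoodList() { this.termᵧ = null; }
FoodList.prototype.setᵧ = function (term) {
    this.termᵧ = term;
    return this;
};

var diary = new ViewDiary().setᵦ(3);
var list = new FoodList().setᵧ("apple");

// the suffix is part of the name, so neither object has the other's method
var diaryHasListSet = "setᵧ" in diary;
var listHasDiarySet = "setᵦ" in list;
console.log(diaryHasListSet, listHasDiarySet);
```

searching for setᵦ now only turns up ViewDiary usages, which is all that 'find usages' needs.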

unfortunately, chrome devtools won't do completion with object properties that contain non-ascii characters. firebug works fine

Note: not all unicode characters are valid in javascript identifiers. here is one validator: http://mothereff.in/js-variables
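for a quick local check, a rough sketch using unicode property escapes in a regex. this is an assumption-laden approximation, not the validator linked above: it needs a newer engine than the ones this post was written against (property escapes arrived in ES2018), it follows the current spec's ID_Start / ID_Continue rules, and it doesn't reject reserved words like 'class'. the function name is made up

```javascript
// first character: $, _ or anything with the unicode ID_Start property
// remaining characters: $, _, ZWNJ/ZWJ, or anything with ID_Continue
var IDENTIFIER_SHAPE = /^[$_\p{ID_Start}][$_\u200C\u200D\p{ID_Continue}]*$/u;

// hypothetical helper: true if the string has the shape of an identifier
// (reserved words are NOT filtered out)
function looksLikeIdentifier(name) {
    return IDENTIFIER_SHAPE.test(name);
}

console.log(looksLikeIdentifier("setᵧ"));  // true, the subscript gamma counts as a letter
console.log(looksLikeIdentifier("a-b"));   // false, '-' is an operator
console.log(looksLikeIdentifier("1abc"));  // false, can't start with a digit
```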

update - bash support

by default my inputrc (ubuntu 12.10 with gnome classic) doesn't appear to support unicode copy or paste, ie bash doesn't work !!! i needed to enable support for meta characters ...

    echo "
    set input-meta on
    set output-meta on
    set convert-meta off
    " >> ~/.inputrc
    bind -f ~/.inputrc

most of the ubuntu apps, eg gnome-terminal and gedit, appear to support typing unicode characters with shift+control+u followed by the hex value of the character and space or enter. unfortunately, neither webstorm nor netbeans support this interface, meaning you need to copy and paste the character from some program that does support them, eg the gnome character map
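if no character map is handy, another way to get something to copy is to print the character from its hex value. a small sketch, assuming node (or any console with String.fromCodePoint) is available - the helper names are made up, and 1d67 is the subscript gamma used in the code above

```javascript
// hypothetical helpers for going between a hex code point and the character

// "1d67" -> the character at U+1D67
function charFromHex(hex) {
    return String.fromCodePoint(parseInt(hex, 16));
}

// the first character of the string -> its code point as lowercase hex
function hexFromChar(ch) {
    return ch.codePointAt(0).toString(16);
}

var gamma = charFromHex("1d67");
console.log(gamma, hexFromChar(gamma));
```

the hex value is the same one you'd type after shift+control+u in the apps that support it.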

for a full list of the unicode characters, install the ubuntu unicode-data package or download the raw list

2 comments:

Jason Worley said...

What effect does this have when the code reaches the browser? Or are you relying on obfuscation to remove the Unicode characters before they get there?

seth/nqzero said...

unicode characters work fine in the browser. and they should - they're officially part of the spec (it's not all characters - i linked to a validator in the article: http://mothereff.in/js-variables)

for the most part, any character that's a letter is valid

the only issue that i've got so far is the developer tools for chrome won't tab-complete identifiers with non-ascii characters. firebug works fine, and i've submitted an issue for chrome