I struggled with this whenever I worked on a multilingual site (especially Japanese and Chinese) where users are allowed to enter characters in their regional language.
A character can take one, two, three, or more bytes. In UTF-8, the byte count depends on the range the character's code point falls in: code points below 128 take one byte, those below 2048 take two, those below 65536 take three, and the rest take four. Based on this, I created the following function, which calculates the size of a string in bytes:
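    function getByteLength(normal_val) {
        // Force string type
        normal_val = String(normal_val);
        var byteLen = 0;
        for (var i = 0; i < normal_val.length; i++) {
            // codePointAt reads a whole code point, so a surrogate pair
            // (e.g. an emoji) is measured once instead of twice
            var c = normal_val.codePointAt(i);
            // A code point above 0xFFFF occupies two UTF-16 units; skip the second
            if (c > 0xFFFF) i++;
            // UTF-8 size is determined by the code point range
            byteLen += c < (1 << 7) ? 1 :
                c < (1 << 11) ? 2 :
                c < (1 << 16) ? 3 : 4;
        }
        return byteLen;
    }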
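A hand-written counter like the one above can be cross-checked against the built-in `TextEncoder`, which always encodes to UTF-8. This is only a sketch: the name `utf8ByteLength` is made up for illustration, and it assumes the runtime exposes `TextEncoder` as a global (modern browsers and recent Node.js versions do).

```javascript
// Hypothetical helper, not part of the original answer.
// Assumption: TextEncoder is available as a global in this runtime.
function utf8ByteLength(value) {
    // encode() returns a Uint8Array holding the UTF-8 bytes of the string
    return new TextEncoder().encode(String(value)).length;
}

console.log(utf8ByteLength("abc"));    // 3 (three single-byte characters)
console.log(utf8ByteLength("日本語")); // 9 (three triple-byte characters)
```

If both approaches agree on your test strings, the manual range checks are probably doing what you expect.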
The above function can be modified to find out whether a string is single-byte or multi-byte.
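One possible modification, as a rough sketch: instead of summing byte sizes, stop at the first character that falls outside the single-byte range. The name `isSingleByteOnly` is hypothetical and not from the original fiddle.

```javascript
// Hypothetical helper, not part of the original answer.
// Returns true when every character fits in one UTF-8 byte (code below 128).
function isSingleByteOnly(value) {
    var s = String(value);
    for (var i = 0; i < s.length; i++) {
        // Any UTF-16 code unit of 0x80 or above needs more than one UTF-8 byte
        if (s.charCodeAt(i) >= 0x80) {
            return false;
        }
    }
    return true;
}

console.log(isSingleByteOnly("hello"));      // true
console.log(isSingleByteOnly("こんにちは")); // false
```

Returning early avoids scanning the rest of a long string once a multi-byte character has been found.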
The following JSFiddle shows the size of the entered text in bytes:
http://jsfiddle.net/paraselixir/d83oaa3v/5/
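So if a string has x characters and its byte size is y: if x === y, all characters are single-byte. If 2*x === y, the characters may all be double-byte, although a mix (for example, one single-byte plus one triple-byte character) produces the same total. Otherwise, the string contains larger multi-byte characters or a combination of single-byte and multi-byte characters.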
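Because comparing totals alone can be ambiguous, a per-character check is a bit more reliable. The sketch below is one way to do it, under some assumptions: `classifyByByteSize` and its return labels are invented for illustration, and `TextEncoder` is assumed to be available as a global.

```javascript
// Hypothetical helper, not part of the original answer.
// Assumption: TextEncoder is available as a global in this runtime.
function classifyByByteSize(value) {
    // Array.from splits by code point, so surrogate pairs stay together
    var chars = Array.from(String(value));
    if (chars.length === 0) {
        return "empty";
    }
    var encoder = new TextEncoder();
    // UTF-8 byte size of each individual character
    var sizes = chars.map(function (ch) {
        return encoder.encode(ch).length;
    });
    var min = Math.min.apply(null, sizes);
    var max = Math.max.apply(null, sizes);
    if (max === 1) {
        return "single-byte";
    }
    if (min === max) {
        return "uniform " + max + "-byte";
    }
    return min === 1 ? "mixed single and multi-byte" : "mixed multi-byte";
}

console.log(classifyByByteSize("abc"));    // "single-byte"
console.log(classifyByByteSize("日本語")); // "uniform 3-byte"
console.log(classifyByByteSize("aあ"));    // "mixed single and multi-byte"
```

Tracking the smallest and largest per-character size distinguishes a uniform string from a mix that happens to have the same total byte count.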
paraS elixiR