Questions tagged [cyrillic]

For questions about code that deals with any kind of Cyrillic (including the original Cyrillic script and the modern Cyrillic alphabets, such as the Russian alphabet).

519 questions
17
votes
4 answers

Python — check if a string contains Cyrillic characters

How to check whether a string contains Cyrillic characters? E.g. >>> has_cyrillic('Hello, world!') False >>> has_cyrillic('Привет, world!') True
Max Malysh
  • 21,876
  • 16
  • 91
  • 102
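
One possible approach to the question above is a regular-expression search over the basic Cyrillic Unicode block. This is a minimal sketch, not the accepted answer: the body of `has_cyrillic` is an illustration, and it assumes that only U+0400 to U+04FF matters, so the supplementary and extended Cyrillic blocks are ignored.

```python
import re

# Assumption: only the basic Cyrillic block (U+0400..U+04FF) counts.
# Cyrillic Supplement and the Extended blocks are deliberately left out.
_CYRILLIC = re.compile(r"[\u0400-\u04FF]")


def has_cyrillic(text: str) -> bool:
    """Return True if text contains at least one basic Cyrillic character."""
    return _CYRILLIC.search(text) is not None
```

With this sketch, `has_cyrillic('Hello, world!')` is `False` and `has_cyrillic('Привет, world!')` is `True`, matching the behaviour requested in the question.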
17
votes
4 answers

Java: java.io.FileNotFoundException for file path with Cyrillic characters

I have a file whose name contains characters not only from the plain ASCII character set, but also from a non-ASCII character set. In my case it contains Cyrillic characters. Here's a snippet of my code: String fileName =…
Dmitry Nelepov
  • 7,106
  • 6
  • 49
  • 69
16
votes
2 answers

Fastest way to encode Cyrillic letters for a URL

If you copy the link below into the browser http://be.wikipedia.org/wiki/Беларусь it will show the Wiki article. But once you want to copy that link (or any other link that contains Cyrillic symbols) from the browser address bar into Notepad, you'll…
Haradzieniec
  • 8,150
  • 26
  • 100
  • 199
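
The percent-encoded form a browser produces for such a link can be reproduced with the standard library. The sketch below is in Python, which may not be the language the question asks about, and `encode_path_segment` is a hypothetical helper name. It assumes the page title is a single path segment and that the target server expects UTF-8.

```python
from urllib.parse import quote, unquote


def encode_path_segment(segment: str) -> str:
    """Percent-encode one URL path segment as UTF-8 (hypothetical helper)."""
    # safe="" also encodes "/", which is right for a single segment.
    return quote(segment, safe="")


encoded = encode_path_segment("Беларусь")
url = "http://be.wikipedia.org/wiki/" + encoded

# unquote() reverses the encoding and restores the readable title.
decoded = unquote(encoded)
```

Each Cyrillic letter becomes two `%XX` escapes because it occupies two bytes in UTF-8.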
14
votes
3 answers

Copying Cyrillic URLs in Google Chrome?

Has anyone solved the problem with copying and pasting Cyrillic links? What steps will reproduce the problem? In Chromium, copying and pasting Cyrillic links is buggy :-( What is the expected result? I expect a normal copy and paste of links from…
user3436007
13
votes
2 answers

MySQL - Russian characters display incorrectly

I have to make a Russian version of a website, but I can't find out how to insert Russian characters into the database. I tried almost every possible encoding, but it only shows: ???????? ?????????? ??????? ??????? ? ????? ?? ????????????? ? ???????,…
Mike
  • 6,471
  • 16
  • 48
  • 67
12
votes
2 answers

PostgreSQL sorting with Cyrillic "ь"

Just take a look, please: WITH toks AS ( SELECT tok FROM unnest('{ь, а, чь, ча, чль, чла}'::text[]) AS tok ORDER BY tok COLLATE "uk_UA" ) SELECT ROW_NUMBER() OVER() AS "#", tok FROM toks ORDER BY tok COLLATE "uk_UA" PostgreSQL 9.3…
brownian
  • 427
  • 3
  • 12
11
votes
2 answers

Handle Turkish uppercase and lowercase correctly, need to modify/override built-in functions?

I am working with multilingual text data, among others with Russian using the Cyrillic alphabet and Turkish. I basically have to compare the words in two files my_file and check_file and if the words in my_file can be found in check_file, write them…
Fable
  • 111
  • 5
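
The Turkish dotted and dotless "i" is the usual reason built-in case functions give surprising results on such data. The sketch below is one possible workaround in Python, not the method from the answers: `turkish_lower` and `turkish_upper` are hypothetical helpers that remap the four affected letters before delegating to the built-in methods. They should only be applied to text known to be Turkish, since they would mis-handle a Latin "I" in other languages.

```python
def turkish_lower(text: str) -> str:
    """Lowercase text using Turkish rules for I/İ (hypothetical helper)."""
    # Handle the two Turkish-specific capitals first; str.lower() alone
    # would turn "I" into "i" and "İ" into "i" plus a combining dot.
    return text.replace("İ", "i").replace("I", "ı").lower()


def turkish_upper(text: str) -> str:
    """Uppercase text using Turkish rules for i/ı (hypothetical helper)."""
    # Dotted "i" becomes dotted "İ"; dotless "ı" becomes plain "I".
    return text.replace("i", "İ").replace("ı", "I").upper()
```

Cyrillic input passes through unchanged by the replacements and is cased by the built-in methods as usual.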
11
votes
2 answers

Detect Russian / Cyrillic in JavaScript string?

I'm trying to detect if a string contains Russian (Cyrillic) characters or not. I'm using this code: term.match(/[\wа-я]+/ig); but it doesn't work – or in fact it just returns the string back as it is. Can somebody help with the right code? Thanks!
Aerodynamika
  • 6,253
  • 10
  • 54
  • 104
10
votes
2 answers

"Wide character in subroutine entry" - UTF-8 encoded Cyrillic words as sequence of bytes

I am working on an Android word game with a large dictionary. The words (over 700 000) are kept as separate lines in a text file (and then put in an SQLite database). To keep competitors from extracting my dictionary, I'd like to encode all words…
Alexander Farber
  • 18,345
  • 68
  • 208
  • 375
9
votes
2 answers

HTML5 Encoding & Cyrillic

Something that made me curious - supposedly the default character encoding in HTML5 is UTF-8. However if I have a plain simple HTML file with an HTML5 doctype like the code below, I get: "hello" in Russian: "ЗдраÑтвуйте" In Chrome 33+,…
dkugappi
  • 2,364
  • 5
  • 19
  • 21
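
Garbled output of this kind typically appears when UTF-8 bytes are decoded with a single-byte Western encoding. The sketch below reproduces the effect in Python under one assumption: Latin-1 stands in for whatever fallback encoding the browser actually picked, which the excerpt does not state.

```python
original = "Здравствуйте"

# Each Cyrillic letter is two bytes in UTF-8.
utf8_bytes = original.encode("utf-8")

# Assumption: Latin-1 as the wrong decoder. It maps every byte to one
# character, so the 12-letter word turns into 24 unrelated characters.
garbled = utf8_bytes.decode("latin-1")

# Reversing the wrong step recovers the text, which shows that the bytes
# were fine and only the declared encoding was wrong.
repaired = garbled.encode("latin-1").decode("utf-8")
```

Declaring the encoding explicitly, for example with a `<meta charset="utf-8">` element, removes the need for the browser to guess.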
9
votes
6 answers

Unable to print Russian characters

I have a Russian string which I have encoded to UTF-8: String str = "\u041E\u041A"; System.out.println("String str : " + str); When I print the string in the Eclipse console I get ?? Can anyone suggest how to print Russian strings to the console or what…
Rohit
  • 371
  • 2
  • 3
  • 10
9
votes
4 answers

Regexp with Russian language

I can't solve my problem with regexp. OK, when I type: $string = preg_replace("#\[name=([a-zA-Z0-9 .-]+)*]#","$name_start $1 $name_end",$string); everything is OK, except with the Russian language. So I tried to rewrite this regexp: $string =…
vorobey
  • 4,101
  • 3
  • 16
  • 18
8
votes
2 answers

Wine and Cyrillic Fonts

I run Wine under Linux Mint 17.2. Cyrillic program names, menu item names, and text files are all unreadable. Some exceptions do exist. For example, I can see Cyrillic text in CoDeSys IDE, but all my keyboard input is "????" on the screen. And…
drvlas
  • 323
  • 1
  • 5
  • 16
8
votes
2 answers

Manipulating files with non-English names in R

When using the R functions to manipulate files in Windows, e.g. dir(), those with non-English characters, like Cyrillic, are presented as a sequence of "?". Similarly, when using file.rename(), if the new name contains non-English characters, the…
antonio
  • 9,285
  • 10
  • 59
  • 113
7
votes
2 answers

Python JSON to CSV - bad encoding, UnicodeDecodeError: 'charmap' codec can't decode byte

I have a problem converting nested JSON to CSV. For this I use https://github.com/vinay20045/json-to-csv (forked a bit to support Python 3.4); here is the full json-to-csv.py file. Converting works if I set #Base Condition else: …
Vic Nicethemer
  • 1,011
  • 2
  • 14
  • 36
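
A `'charmap' codec` error usually means a file was opened with the platform default encoding, such as cp1252 on Windows, instead of UTF-8. The sketch below is not the script from the question: `json_rows_to_csv` is a hypothetical helper, and it assumes the input is already a flat list of objects, whereas the question deals with nested JSON. It only shows passing `encoding="utf-8"` explicitly on both the read and the write side.

```python
import csv
import json


def json_rows_to_csv(json_path: str, csv_path: str) -> None:
    """Convert a flat JSON list of objects to CSV (hypothetical helper)."""
    # Explicit encoding avoids falling back to the locale default.
    with open(json_path, encoding="utf-8") as src:
        rows = json.load(src)  # assumed shape: [{"col": value, ...}, ...]

    # Use the union of all keys so rows with missing fields still fit.
    fieldnames = sorted({key for row in rows for key in row})

    # newline="" is what the csv module expects for files it writes.
    with open(csv_path, "w", encoding="utf-8", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```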