Questions tagged [stringi]

stringi is THE R package for fast, correct, consistent, and convenient string/text processing in any locale and any native character encoding. Its use of the ICU library gives R users a platform-independent set of functions familiar to Java, Perl, Python, PHP, and Ruby programmers.

R's stringi package provides a platform-independent way of manipulating strings. It is built on the ICU library and has a syntax inspired by the stringr package.

237 questions
28
votes
6 answers

Error in R: (Package which is only available in source form, and may need compilation of C/C++/Fortran)

I'm trying to install the 'yaml' and 'stringi' packages in RStudio, and it keeps giving me these errors: > install.packages("stringi") Package which is only available in source form, and may need compilation of C/C++/Fortran: ‘stringi’ These will…
wanax
26
votes
1 answer

gsub speed vs pattern length

I've been using gsub extensively lately, and I noticed that short patterns run faster than long ones, which is not surprising. Here's fully reproducible code: library(microbenchmark) set.seed(12345) n = 0 rpt = seq(20, 1461, 20) msecFF =…
Alexey Ferapontov
22
votes
6 answers

package 'stringi' does not work after updating to R3.2.1

I saw a version of this question posted, but still did not see the answer. I am trying to use ggplot2 but get the following errors (everything worked this morning using R3.0.2 'frisbee sailing' with RStudio version 0.98.1102. I updated both R and…
Kodiakflds
17
votes
5 answers

Subset string by counting specific characters

I have the following strings: strings <- c("ABBSDGNHNGA", "AABSDGDRY", "AGNAFG", "GGGDSRTYHG") I want to cut off the string as soon as the number of occurrences of A, G and N reaches a certain value, say 3. In that case, the result should…
Nivel
16
votes
5 answers

How to install stringi from local file (ABSOLUTELY no Internet Access)

I am working on a remote server using RStudio. This server has no access to the Internet. I would like to install the package "stringi." I have looked at this stackoverflow article, but whenever I use the…
Katya Handler
15
votes
2 answers

R/regex with stringi/ICU: why is a '+' considered a non-[:punct:] character?

I'm trying to remove non-alphabet characters from a vector of strings. I thought the [:punct:] grouping would cover it, but it seems to ignore the +. Does this belong to another group of characters? library(stringi) string1 <- c( "this is a…
screechOwl
14
votes
6 answers

Overlapping matches in R

I have searched and was able to find this forum discussion for achieving the effect of overlapping matches. I also found the following SO question speaking of finding indexes to perform this task, but was not able to find anything concise about…
hwnd
13
votes
2 answers

Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : there is no package called 'stringi'

When I use library(Hmisc) I get the following error Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : there is no package called 'stringi' Error: package 'ggplot2' could not be loaded As well, if I use…
Marta
12
votes
2 answers

How to detect sentence boundaries with OpenNLP and stringi?

I want to break the following string into sentences: library(NLP) # NLP_0.1-7 string <- as.String("Mr. Brown comes. He says hello. i give him coffee.") I want to demonstrate two different ways. One comes from package openNLP: library(openNLP) #…
SRRussel
12
votes
2 answers

Split keep repeated delimiter

I'm trying to use the stringi package to split on a delimiter (potentially the delimiter is repeated) yet keep the delimiter. This is similar to this question I asked moons ago: R split on delimiter (split) keep the delimiter (split) but the…
Tyler Rinker
11
votes
5 answers

Installation of packages ‘stringr’ and ‘stringi’ had non-zero exit status

Please help me install the stringr and stringi packages in R. The result is: install.packages("stringi") Installing package into ‘C:/Users/kozlovpy/Documents/R/win-library/3.2’ (as ‘lib’ is unspecified) trying URL…
Pavel Kozlov
11
votes
6 answers

How to install stringi library from archive and install the local icu52l.zip

We're bumbling through making some R code work in a production environment and as part of that we're installing some R packages as follows: # Default directories and mirrors WORKING_DIR <- "/srv/foo/bar/baz" LIB_DIR <- paste( WORKING_DIR,…
Adam Taylor
9
votes
2 answers

dplyr filter condition to distinguish between unicode symbol and its unicode representation

I am trying to filter the Symbol column based on whether it's of the form \uxxxx This is easy visually, that is, some look like $, ¢, £, and others like \u058f, \u060b, \u07fe. But I cannot seem to figure it out using stringi /…
stevec
9
votes
1 answer

Filter by multiple patterns with filter() and str_detect()

I would like to filter a dataframe using filter() and str_detect() matching for multiple patterns without multiple str_detect() function calls. In the example below I would like to filter the dataframe df to show only rows containing the letters a f…
user6571411
9
votes
3 answers

Unexpected behaviour with str_replace "NA"

I'm trying to convert a character string to numeric and have encountered some unexpected behaviour with str_replace. Here's a minimum working example: library(stringr) x <- c("0", "NULL", "0") # This works, i.e. 0 NA 0 as.numeric(str_replace(x,…
jkeirstead