
How can I collapse a run of identical consecutive characters (not whitespace or tabs) into a single character?

Here is what I mean. This is the input data:

5 NO 58AA WOD~05293_NODC~58AA 005450 WOD~NO005450 6246.630096435547 418.6500072479248 22.540432044045843 -299.02194134859894 06.01.2013 656 368 .NULL. z nc_unique_cast~5#wod_unique_cast~15713249#lat~62.7771682739258#lon~4.31083345413208#time~88759.2895833254#date~20130106#GMT_time~6.94999980926514#Access_no~110866###Bottom_Depth~368######z_row_size~361#Temperature_row_size~361#Temperature_WODprofileflag~0#Salinity_row_size~361#Salinity_WODprofileflag~0#Pressure_row_size~361######country~NORWAY#WOD_cruise_identifier~NO005450##Platform~HAAKON MOSBY (R/V;call sign LJIT;uilt Sep 1980;IMO7922233)######dataset~CTD###dbase_orig~ICES (International Council for the Exploration of the Sea)#########ODV_yyyy-mm-ddThh:mm:ss.sss~2013-01-06T06:56:59.9993#Platform_WOD_code~5293##code_name_units_Platform~5293#Platform_by_WODSelect~HAAKON MOSBY (R/V;call sign LJIT;uilt Sep 1980;IMO7922233)#Platform_NODC_code~58AA#Platform_Name~HAAKON MOSBY (R/V;call sign LJIT;uilt Sep 1980;IMO7922233)#Platform_by_s_3_platform~HAAKON MOSBY (R/V;call sign LJIT;uilt Sep 1980;IMO7922233)# ocldb1538536654.25679_CTD.nc 15713249 .NULL. HAAKON MOSBY (R/V;call sign LJIT;uilt Sep 1980;IMO7922233) 0 .NULL. 368 361 361 722 .NULL. .NULL. 06.01.2013 06:56:59.9993

and this is the desired output:

5 NO 58AA WOD~05293_NODC~58AA 005450 WOD~NO005450 6246.630096435547 418.6500072479248 22.540432044045843 -299.02194134859894 06.01.2013 656 368 .NULL. z nc_unique_cast~5#wod_unique_cast~15713249#lat~62.7771682739258#lon~4.31083345413208#time~88759.2895833254#date~20130106#GMT_time~6.94999980926514#Access_no~110866#Bottom_Depth~368#z_row_size~361#Temperature_row_size~361#Temperature_WODprofileflag~0#Salinity_row_size~361#Salinity_WODprofileflag~0#Pressure_row_size~361#country~NORWAY#WOD_cruise_identifier~NO005450#Platform~HAAKON MOSBY (R/V;call sign LJIT;uilt Sep 1980;IMO7922233)#dataset~CTD#dbase_orig~ICES (International Council for the Exploration of the Sea)#ODV_yyyy-mm-ddThh:mm:ss.sss~2013-01-06T06:56:59.9993#Platform_WOD_code~5293#code_name_units_Platform~5293#Platform_by_WODSelect~HAAKON MOSBY (R/V;call sign LJIT;uilt Sep 1980;IMO7922233)#Platform_NODC_code~58AA#Platform_Name~HAAKON MOSBY (R/V;call sign LJIT;uilt Sep 1980;IMO7922233)#Platform_by_s_3_platform~HAAKON MOSBY (R/V;call sign LJIT;uilt Sep 1980;IMO7922233)# ocldb1538536654.25679_CTD.nc 15713249 .NULL. HAAKON MOSBY (R/V;call sign LJIT;uilt Sep 1980;IMO7922233) 0 .NULL. 368 361 361 722 .NULL. .NULL. 06.01.2013 06:56:59.9993

That is, replace every run of consecutive "#" characters with a single "#".

How can I do this?

d-max
  • related: https://stackoverflow.com/questions/4736/learning-regular-expressions https://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean – jogo Nov 14 '18 at 08:33

2 Answers


You can use gsub to replace 2 or more "#" with just one. For example:

gsub("#{2,}", "#", c("test", "test###test", "test#test", "##test####"))
# [1] "test"      "test#test" "test#test" "#test#"  
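
As a quick check against the question's data, here is the same call applied to a short fragment copied from the input line. The variable names `x` and `collapsed` are only illustrative; the full line from the question should behave the same way.

```r
# Fragment copied from the question's input line
x <- "Access_no~110866###Bottom_Depth~368######z_row_size~361"

# Replace every run of two or more "#" with a single "#"
collapsed <- gsub("#{2,}", "#", x)
print(collapsed)
# [1] "Access_no~110866#Bottom_Depth~368#z_row_size~361"
```

Note that the "#z" before "z_row_size" is preserved: only the repeated "#" characters are collapsed, nothing else is removed.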
Ben Bolker
MrFlick

You can replace every run of one or more "#" with a single "#" using gsub:

gsub("#+","#", your_vector)
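
If more than one kind of character needs collapsing, a backreference can do it in one pass. The sketch below is only an illustration: `collapse_runs` and its `chars` argument are hypothetical names, not part of base R, and it assumes `chars` contains no characters that are special inside a regex bracket expression (such as `]`, `^`, `-` or `\`).

```r
# Hypothetical helper: collapse runs of any character listed in `chars`.
# Assumes `chars` has no bracket-expression metacharacters (], ^, -, \).
collapse_runs <- function(x, chars = "#") {
  # ([chars]) captures one listed character; \1+ matches its repeats
  pattern <- paste0("([", chars, "])\\1+")
  gsub(pattern, "\\1", x, perl = TRUE)
}

y <- c("a##b~~~c", "##test####", "58AA 005450")

collapse_runs(y)
# [1] "a#b~~~c"     "#test#"      "58AA 005450"

collapse_runs(y, "#~")
# [1] "a#b~c"       "#test#"      "58AA 005450"
```

Restricting the pattern to an explicit set of characters matters for data like the question's: collapsing every repeated non-whitespace character would also turn values such as "58AA" into "58A".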
GordonShumway