I have the following data:
• PRT_Edit & Set Shopping Cart in Retail
• PRT_Confirm Shopping Cart for Goods
o PRT-Ret_Process Supplier Invoice
o PRT-Web_Overview of Orders
o PRT_Update Outfirst Agreement
PRT_Axn_-Purchase and Requisition
The data contains special symbols (bullets), tabs, and spaces. I want to extract only the text part of each line, like this:
PRT_Edit & Set Shopping Cart in Retail
PRT_Confirm Shopping Cart for Goods
PRT-Ret_Process Supplier Invoice
PRT-Web_Overview of Orders
PRT_Update Outfirst Agreement
I tried using REGEX_EXTRACT_ALL in a Pig script as shown below, but it does not work.
PRT = LOAD '/DATA' USING TextLoader() AS (LINE:CHARARRAY);
Cleansed = FOREACH PRT GENERATE REGEX_EXTRACT_ALL(LINE,'[A-Z]*') AS DATA;
When I dump Cleansed, it does not show any data. Can anyone please help?
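For reference, here is a minimal sketch of the extraction in plain Java. It rests on two assumptions that are not confirmed above: that Pig's regex functions follow java.util.regex semantics, and that REGEX_EXTRACT_ALL returns something only when the pattern matches the whole line. If both hold, '[A-Z]*' fails on every line that contains anything other than capital letters, and it has no capture group to return in any case. The class name PrtTextExtractor and the helper extract are hypothetical, and the pattern assumes every wanted value starts with "PRT", as in the sample data.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrtTextExtractor {
    // Lazily skip any leading bullet, tab, or space, then capture from "PRT" to the end of the line.
    // Assumption: every wanted value starts with "PRT", as in the sample data.
    private static final Pattern PRT_TEXT = Pattern.compile("^.*?(PRT.*)$");

    // Hypothetical helper: returns the text part of a line, or null when the line has no "PRT".
    static String extract(String line) {
        Matcher m = PRT_TEXT.matcher(line);
        return m.matches() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String bulleted = "\u2022\tPRT_Edit & Set Shopping Cart in Retail";
        // The original pattern cannot match the whole line, so this prints false.
        System.out.println(Pattern.matches("[A-Z]*", bulleted));
        // Prints: PRT_Edit & Set Shopping Cart in Retail
        System.out.println(extract(bulleted));
        // Prints: PRT-Ret_Process Supplier Invoice
        System.out.println(extract("o PRT-Ret_Process Supplier Invoice"));
    }
}
```

In Pig, the same pattern could presumably be passed to REGEX_EXTRACT with group index 1, for example REGEX_EXTRACT(LINE, '^.*?(PRT.*)$', 1). That call is untested here.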