I am trying to import a JSON file into a Python editor so that I can analyse the data. I am quite new to Python, so I am not sure how to achieve this. My JSON file is full of tweet data; an example record is shown here:
{"id":441999105775382528,"score":0.0,"text":"blablabla","user_id":1441694053,"created":"Fri Mar 07 18:09:33 GMT 2014","retweet_id":0,"source":"<a href=\"http://twitterfeed.com\" rel=\"nofollow\">twitterfeed</a>","geo_long":null,"geo_lat":null,"location":"","screen_name":"SevenPS4","name":"Playstation News","lang":"en","timezone":"Amsterdam","user_created":"2013-05-19","followers":463,"hashtags":"","mentions":"","following":1062,"urls":"http://bit.ly/1lcbBW6","media_urls":"","favourites_count":4514,"reply_status_id":0,"reply_user_id":0,"is_truncated":false,"is_retweet":false,"original_text":null,"status_count":4514,"description":"Tweeting the latest Playstation news!","url":null,"utc_offset":3600}
My questions:
How do I import the JSON file so that I can perform analysis on it in a Python editor?
How do I perform analysis on only a set number of records (e.g. the first 100 or 200 instead of all of them)?
Is there a way to get rid of some of the fields, such as score, user_id, created, etc., without having to go through all of my data manually to do so?
Some of the tweets have invalid/unusable symbols within them. Is there any way to get rid of those without having to go through them manually?
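For reference, the four questions above could be covered by something like the following rough sketch. It rests on several assumptions that may not hold for the real data: that the file stores one JSON object per line (as in the example record), that "unusable symbols" means anything outside printable ASCII, and that the file is called tweets.json. The names load_tweets, clean_text and DROP_FIELDS are made up for illustration.

```python
import json
from itertools import islice

# Hypothetical choice of fields to discard; adjust as needed.
DROP_FIELDS = {"score", "user_id", "created"}


def clean_text(text):
    """Keep only printable ASCII (an assumed definition of 'usable')."""
    return "".join(ch for ch in text if " " <= ch <= "~")


def load_tweets(lines, limit=None):
    """Parse up to `limit` tweets from an iterable of JSON lines.

    Assumes one JSON object per line; blank lines are skipped.
    With limit=None every line is read.
    """
    non_blank = (line for line in lines if line.strip())
    tweets = []
    for line in islice(non_blank, limit):
        tweet = json.loads(line)
        # Drop unwanted fields; missing fields are ignored.
        for field in DROP_FIELDS:
            tweet.pop(field, None)
        # Strip symbols from the tweet text if it is present.
        if isinstance(tweet.get("text"), str):
            tweet["text"] = clean_text(tweet["text"])
        tweets.append(tweet)
    return tweets


# Usage with a file (the name "tweets.json" is an assumption):
# with open("tweets.json", encoding="utf-8") as f:
#     first_100 = load_tweets(f, limit=100)
```

Each tweet ends up as an ordinary Python dict, so the result is a list of dicts that can be looped over or passed to other tools. If the file is instead a single JSON array, json.load on the open file would be the starting point rather than reading line by line.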