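Using Python to Mine Newsfeed Text and Extract Its Viewpoints
1. Environment Setup
This article uses an Anaconda Python 3.6 virtual environment. The following additional packages need to be installed: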
- tqdm (a progress bar python utility): pip install tqdm
- nltk (for natural language processing): pip install nltk
- bokeh (for interactive data viz): conda install bokeh
- gensim: pip install --upgrade gensim
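
Once the packages above are installed, a quick check can confirm that the active environment actually sees them. The following is a minimal sketch using only the standard library. The helper name `missing_packages` and the `REQUIRED` list are illustrative and not part of the original article, and the sketch assumes each package's import name matches its install name.

```python
import importlib.util

# Import names of the packages listed above
# (assumption: import name == install name for all four).
REQUIRED = ["tqdm", "nltk", "bokeh", "gensim"]


def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Running this inside the activated virtual environment lists anything that still needs to be installed, without importing the packages themselves.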
```python
import requests
from bs4 import BeautifulSoup
import pandas as pd
from datetime import datetime
from tqdm import tqdm, tqdm_notebook
from functools import reduce


def getSources():
    # Ask the NewsAPI v1 endpoint for all English-language sources.
    source_url = 'https://newsapi.org/v1/sources?language=en'
    response = requests.get(source_url).json()
    # Keep only the id of each source.
    sources = []
    for source in response['sources']:
        sources.append(source['id'])
    return sources
```
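
The id-collecting step inside `getSources()` can be tried without a network connection by separating it from the HTTP call. The sketch below is illustrative only: `extract_source_ids` is a hypothetical helper, and the sample payload is invented to mimic the shape the function reads (a `sources` list whose entries carry an `id`). It is not real API output.

```python
def extract_source_ids(payload):
    """Collect the 'id' of every entry under payload['sources'].

    Entries without an 'id' key are skipped, and a missing 'sources'
    key yields an empty list instead of raising KeyError.
    """
    return [source["id"] for source in payload.get("sources", []) if "id" in source]


# Hypothetical payload for illustration only; not real API output.
sample = {
    "sources": [
        {"id": "source-a", "name": "Source A"},
        {"id": "source-b", "name": "Source B"},
        {"name": "entry without an id"},
    ]
}

print(extract_source_ids(sample))
```

Keeping the parsing in its own function makes it possible to test the logic against a saved response before relying on the live endpoint.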