Articles Database Analysis#

Script written to query, analyze, and plot data from different research-related databases.

This module provides functions for querying the number of published papers, or Google searches, containing a user-defined keyword over a custom time range, and for plotting the resulting data to show how interest in the topic has evolved over the years.

The module workflow is based on my personal preferences: most of the data is handled as pandas DataFrames, which can be saved as CSV files, and stored CSV files can be read back for analysis and plotting.
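The DataFrame-to-CSV round trip described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the column names `date` and `n_articles`, the file name, and the counts themselves are made up for the example.

```python
import os
import tempfile

import pandas as pd

# Made-up monthly counts, purely illustrative (not real query results).
# The column names "date" and "n_articles" are assumptions for this sketch.
counts = pd.DataFrame(
    {
        "date": pd.to_datetime(["2000-01-01", "2000-02-01", "2001-01-01"]),
        "n_articles": [10, 12, 15],
    }
)

# Save the DataFrame as a CSV file in a temporary folder.
csv_path = os.path.join(tempfile.mkdtemp(), "keyword_counts.csv")
counts.to_csv(csv_path, index=False)

# Read the stored CSV back, parsing the date column, ready for analysis.
loaded = pd.read_csv(csv_path, parse_dates=["date"])

# Example analysis step: total number of articles per year.
yearly = loaded.groupby(loaded["date"].dt.year)["n_articles"].sum()
```

Keeping the date column as a real datetime makes it straightforward to regroup the same CSV by month or by year before plotting.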

The code cells were designed for users without a paid API key, so they are not the most optimized option for users who have one of these paid APIs.

More specifically, this notebook presents interactions with:

  • Google Trends

  • Scopus Database

  • PubMed Database

API_Keys#

The PubMed NCBI API key is optional, and the script was developed with a keyless workflow in mind. Feel free to dig further into the API if it interests you.
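To illustrate what "optional" means in practice, the sketch below builds a request URL for the NCBI E-utilities `esearch` endpoint with and without a key. It is not how this module talks to PubMed (that goes through Metapub); the helper name `build_count_url` and the placeholder key are assumptions for the example. NCBI allows up to 3 requests per second without a key and up to 10 per second with one.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_count_url(term, api_key=None):
    """Build an esearch URL that asks only for the number of matching records.

    Hypothetical helper for illustration; the key is added only if provided.
    """
    params = {
        "db": "pubmed",
        "term": term,
        "rettype": "count",
        "retmode": "json",
    }
    if api_key:
        params["api_key"] = api_key
    return f"{ESEARCH_URL}?{urlencode(params)}"


# No network request is made here; the URLs are only constructed.
url_without_key = build_count_url("crispr")
url_with_key = build_count_url("crispr", api_key="MY_KEY")  # placeholder key
```

The request is the same in both cases, so a keyless workflow only needs to pace its requests more conservatively.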

PubMed NCBI Database

Fetches the PubMed NCBI database, providing ways to analyze how interest in scientific articles on a specific subject changes over time.




Packages Installation#

All interactions with the PubMed NCBI database are done through the Python package Metapub.

This package can be installed by running the following in a terminal:

python -m pip install metapub

Other important requirements are listed in the requirements.txt file of the database_analysis module folder.

All interactions with the Scopus database are done through the Python package Pybliometrics.

This database interaction relies on an API key and the requests Python package.




Client usage#

  • To generate a CSV file with the number of published articles per month containing a keyword, from 2000 to 2023, run:

python cli.py --pubmed '<keyword>' 2000 2023 -o <output_path>
  • Note: the PubMed database was created in January 1996.

  • If no interval is given, the default querying interval is 1996-2023.
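The command and defaults above can be sketched roughly as below. This is a hypothetical reconstruction, not the actual contents of cli.py: the function names, the default output file name, and the way the per-month PubMed search terms are built are all assumptions. The `[dp]` tag is PubMed's publication-date field.

```python
import argparse
import calendar

# Default interval stated in the notes above.
DEFAULT_START, DEFAULT_END = 1996, 2023


def parse_pubmed_args(argv):
    """Parse `--pubmed KEYWORD [START END] -o OUTPUT` (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="cli.py")
    parser.add_argument("--pubmed", nargs="+", metavar="ARG", required=True)
    parser.add_argument("-o", "--output", default="pubmed_counts.csv")
    args = parser.parse_args(argv)

    keyword, *years = args.pubmed
    if len(years) not in (0, 2):
        parser.error("--pubmed expects KEYWORD or KEYWORD START END")
    # Fall back to the default interval when no years are given.
    start, end = map(int, years) if years else (DEFAULT_START, DEFAULT_END)
    return keyword, start, end, args.output


def monthly_query_terms(keyword, start_year, end_year):
    """Yield (month label, PubMed search term) for every month in the range."""
    for year in range(start_year, end_year + 1):
        for month in range(1, 13):
            last_day = calendar.monthrange(year, month)[1]
            date_range = (
                f"{year}/{month:02d}/01:{year}/{month:02d}/{last_day:02d}[dp]"
            )
            yield f"{year}-{month:02d}", f"({keyword}) AND ({date_range})"


# Usage: explicit years, then the default interval.
explicit = parse_pubmed_args(["--pubmed", "crispr", "2000", "2023", "-o", "out.csv"])
defaults = parse_pubmed_args(["--pubmed", "crispr"])
terms_2000 = list(monthly_query_terms("crispr", 2000, 2000))
```

Each yielded term can then be sent as one query, which gives one count per month for the output CSV.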

This notebook requires the following packages to be installed in order to be fully executed:

python -m pip install pandas pytrends metapub matplotlib pybliometrics numpy

The basic workflow is executed by the client module, as in the following examples:

Functions#