
A reusable skill that transforms any BibTeX file into 12 professional charts, summary statistics, and a publication-ready report. The Python equivalent of R's bibliometrix package.
68 peer-reviewed articles on AI in Auditing (2020–2025)
Summary statistics reported: Documents, Sources, Authors, Co-Authors/Doc, Keywords, With DOI
Four-step pipeline from raw BibTeX to publication-ready visualizations
Export a .bib file from Zotero, Mendeley, Scopus, or Web of Science.
Execute the script with optional year filters and custom domain keywords.
Inspect 12 charts, JSON statistics, and CSV/XLSX data exports.
Use the Markdown template to build a narrative around the generated visualizations.

Three commands to go from .bib to charts
sudo pip3 install bibtexparser wordcloud networkx openpyxl

python /home/ubuntu/skills/bibliometric-analysis/scripts/run_analysis.py \
my_references.bib \
./output \
--year-min 2020 --year-max 2025

pandoc output/report.md -o output/report.docx

| Parameter | Required | Description |
|---|---|---|
| bib_file | Yes | Path to the .bib (BibTeX) file |
| output_dir | Yes | Directory for charts and data output |
| --year-min | No | Minimum publication year to include |
| --year-max | No | Maximum publication year to include |
| --domain-keywords | No | Text file with custom keywords (one per line) |
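
To make the parameter table concrete, here is a minimal sketch of a command-line parser that accepts the same arguments. It is not the code of run_analysis.py: `build_parser` and `year_in_range` are hypothetical names, and treating the year bounds as inclusive is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the parameter table (not the script's real code)."""
    parser = argparse.ArgumentParser(prog="run_analysis.py")
    parser.add_argument("bib_file", help="Path to the .bib (BibTeX) file")
    parser.add_argument("output_dir", help="Directory for charts and data output")
    parser.add_argument("--year-min", type=int, default=None,
                        help="Minimum publication year to include")
    parser.add_argument("--year-max", type=int, default=None,
                        help="Maximum publication year to include")
    parser.add_argument("--domain-keywords", default=None,
                        help="Text file with custom keywords (one per line)")
    return parser


def year_in_range(year, year_min=None, year_max=None) -> bool:
    """Assumed filter: both bounds are inclusive, and a missing bound means no limit."""
    if year_min is not None and year < year_min:
        return False
    if year_max is not None and year > year_max:
        return False
    return True
```

Calling `build_parser().parse_args(["my_references.bib", "./output", "--year-min", "2020"])` yields `year_min == 2020` and leaves `year_max` and `domain_keywords` as `None`.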
Generated from 68 articles on AI in Auditing (2020–2025)

Bar chart with cumulative line showing publication growth over time.
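
The data behind such a chart can be sketched as a per-year count plus a running total. This is an illustration only: `annual_production` is a hypothetical helper, and filling gap years with zero is an assumption about how the bars are drawn.

```python
from collections import Counter
from itertools import accumulate


def annual_production(years):
    """Return (year, count, cumulative) rows; years with no articles get a count of 0."""
    if not years:
        return []
    counts = Counter(years)
    span = range(min(counts), max(counts) + 1)
    per_year = [counts[y] for y in span]       # bar heights
    cumulative = list(accumulate(per_year))    # cumulative line
    return list(zip(span, per_year, cumulative))
```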

Top 15 journals and proceedings ranked by article count.

Top 15 authors by number of publications in the corpus.

Bubble chart showing top authors' production across years.

Top 20 domain keywords extracted from titles and abstracts.

Network graph of keyword relationships and co-occurrences.
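
One common way to derive such a network is to count how often two keywords appear in the same article. The sketch below assumes that approach; `cooccurrence_counts` is a hypothetical name, and the script's actual weighting may differ.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count, for each unordered keyword pair, the number of articles containing both."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs
```

Each `(a, b)` pair and its count can then be added to a networkx graph with `Graph.add_edge(a, b, weight=count)`.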

Visual representation of keyword frequency and prominence.

Distribution of single vs. multi-authored publications.

Network of collaboration between the top 20 authors.

How research themes shifted across time periods.
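
The notes on this page state that thematic evolution splits the corpus into 3 equal time periods. A minimal sketch of one way to do that, assuming the split is by calendar-year span rather than by article count and that leftover years go to the earliest periods; `split_periods` is a hypothetical name.

```python
def split_periods(year_min, year_max, n=3):
    """Split [year_min, year_max] into up to n contiguous, near-equal year ranges."""
    span = year_max - year_min + 1
    base, extra = divmod(span, n)
    periods = []
    start = year_min
    for i in range(n):
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # fewer years than periods: skip empty ranges
        periods.append((start, start + length - 1))
        start += length
    return periods
```

For the 2020–2025 corpus this gives three two-year windows: 2020–2021, 2022–2023, and 2024–2025.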

Author productivity distribution following Lotka's inverse square law.
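
Under Lotka's inverse square law, the share of authors with n publications is roughly C / n², where C = 6/π² ≈ 0.61 when the exponent is exactly 2. The sketch below compares observed shares with that expectation. `lotka_table` is a hypothetical helper, and the fixed exponent of 2 is an assumption; the script may fit the exponent instead.

```python
import math
from collections import Counter


def lotka_table(pubs_per_author):
    """Return (n, observed_share, expected_share) rows for each publication count n."""
    total = len(pubs_per_author)
    observed = Counter(pubs_per_author)
    c = 6 / math.pi ** 2  # normalizing constant for exponent 2
    return [(n, observed[n] / total, c / n ** 2) for n in sorted(observed)]
```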

Sources × Keywords × Authors relationship visualization.
Everything the script generates in a single run
chart_01 – chart_12.png: 12 publication-quality PNG charts at 150 DPI
bibliometric_stats.json: Summary statistics as structured JSON
bibliometric_data.csv: Cleaned article-level data in CSV format
bibliometric_data.xlsx: Same data exported as Excel workbook
report_template.md: Markdown template with placeholders for commentary
customization.md: Reference guide for tuning keywords, colors, and chart parameters
Adapt the analysis to any research domain
Create a plain-text file with one keyword per line to override the default AI + Auditing vocabulary:
artificial intelligence
machine learning
audit
auditing
accounting
ethics
professional judgment
formation

Then pass it with --domain-keywords keywords.txt.
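
As an illustration of how a one-keyword-per-line file could be applied, the sketch below loads the file and finds which keywords occur in a title or abstract. `load_keywords` and `match_keywords` are hypothetical names, and case-insensitive whole-word matching is an assumption about the script's behavior.

```python
import re
from pathlib import Path


def load_keywords(path):
    """Read one keyword per line, skipping blank lines and normalizing to lowercase."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip().lower() for line in lines if line.strip()]


def match_keywords(text, keywords):
    """Return the keywords that occur in text as whole words or phrases."""
    lowered = text.lower()
    found = []
    for keyword in keywords:
        # Word boundaries keep "audit" from matching inside "auditing".
        if re.search(r"\b" + re.escape(keyword) + r"\b", lowered):
            found.append(keyword)
    return found
```

With whole-word matching, "audit" and "auditing" are counted separately, which is why both appear in the example file.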
Keywords extracted from titles, abstracts, and author keywords using a domain-specific dictionary
Journal names cleaned of LaTeX artifacts ({}, \&, \textbackslash)
Conference proceedings from booktitle fields included with (Proceedings) suffix
Thematic evolution automatically splits corpus into 3 equal time periods
All charts use 150 DPI, white backgrounds, and Material Design color palette
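
The journal-name cleanup described above might look like the following. This is a minimal sketch, not the script's code: `clean_source_name` is a hypothetical helper, and the exact replacement order is an assumption.

```python
import re


def clean_source_name(name, is_proceedings=False):
    """Strip LaTeX artifacts ({}, \\&, \\textbackslash) from a journal or booktitle."""
    name = name.replace(r"\textbackslash", "")  # drop escaped-backslash macro
    name = name.replace(r"\&", "&")             # unescape ampersands
    name = name.replace("{", "").replace("}", "")
    name = re.sub(r"\s+", " ", name).strip()    # collapse leftover whitespace
    if is_proceedings:
        name += " (Proceedings)"                # suffix for booktitle entries
    return name
```

Removing `\textbackslash` before unescaping `\&` handles exports where the two are chained, as in `\textbackslash\&`.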