Metadata-Version: 2.4
Name: onecite
Version: 0.1.1
Summary: Citation management and academic reference toolkit
Author-email: He Zhiang <ang@hezhiang.com>
License: MIT
Project-URL: Homepage, https://github.com/HzaCode/OneCite
Project-URL: Documentation, https://hezhiang.com/OneCite/
Project-URL: Repository, https://github.com/HzaCode/OneCite
Project-URL: Issues, https://github.com/HzaCode/OneCite/issues
Project-URL: Changelog, https://github.com/HzaCode/OneCite/blob/main/docs/changelog.rst
Project-URL: Bug Tracker, https://github.com/HzaCode/OneCite/issues
Keywords: citation,bibliography,bibtex,academic,reference,doi,arxiv
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Markup
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.25.0
Requires-Dist: beautifulsoup4>=4.9.0
Requires-Dist: bibtexparser<2.0.0,>=1.2.0
Requires-Dist: thefuzz>=0.19.0
Requires-Dist: pyyaml>=5.4.0
Requires-Dist: feedparser>=6.0.0
Provides-Extra: scholar
Requires-Dist: scholarly>=1.5.0; extra == "scholar"
Provides-Extra: dev
Requires-Dist: pytest>=6.0; extra == "dev"
Requires-Dist: pytest-cov>=2.10; extra == "dev"
Requires-Dist: flake8>=3.8; extra == "dev"
Requires-Dist: black>=20.8b1; extra == "dev"
Dynamic: license-file



<div align="center">
  <p align="center">
    <img src="https://github.com/HzaCode/OneCite/raw/master/logo_.jpg" alt="OneCite Logo" width="160" />
  </p>

  <h1>OneCite</h1>
  <h3>Citation & Academic Reference Toolkit</h3>
</div>

<div align="center">

[![Downloads](https://img.shields.io/pepy/dt/onecite?style=flat-square&label=Downloads)](https://pepy.tech/project/onecite)
[![Awesome CLI Apps](https://img.shields.io/badge/Featured-Awesome%20CLI%20Apps%2019.2k⭐-FF6B35?style=flat-square&logo=awesome-lists&logoColor=white)](https://github.com/agarrharr/awesome-cli-apps?tab=readme-ov-file#academia)

[![Tests](https://img.shields.io/github/actions/workflow/status/HzaCode/OneCite/tests.yml?style=flat-square&logo=github)](https://github.com/HzaCode/OneCite/actions)
[![codecov](https://img.shields.io/codecov/c/github/HzaCode/OneCite?style=flat-square&logo=codecov)](https://codecov.io/gh/HzaCode/OneCite)
[![PyPI](https://img.shields.io/pypi/v/onecite?style=flat-square&logo=pypi&color=blue)](https://pypi.org/project/onecite/)
[![Python](https://img.shields.io/badge/3.10+-blue?style=flat-square&logo=python)](https://www.python.org)
[![MIT](https://img.shields.io/badge/MIT-green?style=flat-square)](LICENSE)
[![Docs](https://img.shields.io/badge/Docs-Pages-blue?style=flat-square&logo=github)](https://hzacode.github.io/OneCite/)
[![Awesome LaTeX](https://img.shields.io/badge/Awesome-LaTeX-008B8B?style=flat-square&logo=awesome-lists&logoColor=white&labelColor=493267)](https://github.com/egeerardyn/awesome-LaTeX?tab=readme-ov-file#bibliography-tools)


</div>

<p align="center">
  <a href="#features">Features</a> •
  <a href="#quick-start">Quick Start</a> •
  <a href="#-advanced-usage">📖 Advanced Usage</a> •
  <a href="#-roadmap">🗺️ Roadmap</a> •
  <a href="#-contributing">🤝 Contributing</a>
</p>

---

<p align="center">
  OneCite is a command-line tool and Python library for citation management. It accepts DOIs, paper titles, arXiv IDs, and mixed inputs, and outputs formatted bibliographic entries.
</p>

---

## Statement of Need

Researchers frequently accumulate reference lists in ad-hoc formats — DOIs copied from browser tabs, arXiv IDs from paper PDFs, titles typed by hand, and BibTeX fragments from various sources. Cleaning these into a consistent, complete `.bib` file is tedious and error-prone.

OneCite solves this by accepting **any mix of identifiers and text queries** and automatically resolving them to structured BibTeX through a pipeline of academic APIs (CrossRef, arXiv, PubMed, Semantic Scholar, and others). It is designed for researchers who work primarily in the terminal, use LaTeX, and want a lightweight, scriptable tool — not a full reference manager.

**When to use OneCite vs. alternatives:**

| Tool | Best for |
|---|---|
| **OneCite** | One-shot conversion of messy reference lists to BibTeX in a terminal/script |
| **Zotero** | Long-term reference management, GUI-based, browser integration |
| **CrossRef API directly** | When you have clean DOIs and want canonical metadata |
| **doi2bib** | Single DOI → BibTeX conversion, no fuzzy matching |

---

## Features

| Feature                 | Description                                                                                             |
| ----------------------- | ------------------------------------------------------------------------------------------------------- |
| **Fuzzy Matching**          | Match references against multiple academic databases even from incomplete or inaccurate info.         |
| **Multiple Formats**        | Input `.txt`/`.bib` → Output **BibTeX**.                                                             |
| **4-stage Pipeline**        | Runs each entry through four stages (clean → query → validate → format) to produce consistent output. |
| **Field Completion**        | Enrich entries by filling in missing fields like journal, volume, pages, authors, and abstract.                |
| 🎓 **7+ Citation Types**    | Handles journal articles, conference papers, books, software, datasets, theses, and preprints.        |
| **Multi-Source Lookup**     | Queries CrossRef, arXiv, PubMed, Semantic Scholar, Google Books, and others for every entry.         |
| **Many Identifier Types**   | Accepts DOI, PMID, arXiv ID, ISBN, GitHub URL, Zenodo DOI, or plain text queries.                    |
| 🎛️ **Interactive Mode**    | Manually select the correct entry when multiple potential matches are found.                          |
| **Custom Templates**        | YAML-based presets that provide a fallback BibTeX entry type when auto-detection is inconclusive.    |
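
As a rough illustration of the clean → query → validate → format flow, the sketch below wires four small functions together. It is not OneCite's implementation: every name here (`Entry`, `clean`, `query`, `validate`, `format_bibtex`, `run_pipeline`) is hypothetical, the lookup is an injected callable rather than a real API client, and the citation-key scheme only imitates the keys shown in the Quick Start sample output.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Entry:
    raw: str                                      # cleaned input text
    metadata: dict = field(default_factory=dict)  # filled in by the query stage


def clean(text: str) -> list[Entry]:
    """Stage 1: split on blank lines, drop '#' comment lines, join the rest."""
    entries = []
    for block in re.split(r"\n\s*\n", text):
        lines = [ln.strip() for ln in block.splitlines()]
        lines = [ln for ln in lines if ln and not ln.startswith("#")]
        if lines:
            entries.append(Entry(raw=" ".join(lines)))
    return entries


def query(entry: Entry, lookup) -> Entry:
    """Stage 2: fetch metadata from a source (hypothetical injected callable)."""
    entry.metadata = dict(lookup(entry.raw) or {})
    return entry


def validate(entry: Entry) -> bool:
    """Stage 3: require the minimum fields needed for a usable citation."""
    return all(entry.metadata.get(k) for k in ("title", "author", "year"))


def format_bibtex(entry: Entry) -> str:
    """Stage 4: render a BibTeX record with a Surname+Year+FirstWord key."""
    md = dict(entry.metadata)
    entry_type = md.pop("type", "misc")
    surname = md["author"].split(",")[0].strip()
    first_word = re.sub(r"\W", "", md["title"].split()[0])
    fields = "\n".join(f'  {name} = "{value}",' for name, value in md.items())
    return f"@{entry_type}{{{surname}{md['year']}{first_word},\n{fields}\n}}"


def run_pipeline(text: str, lookup) -> list[str]:
    """Run all four stages and return one BibTeX string per valid entry."""
    queried = (query(entry, lookup) for entry in clean(text))
    return [format_bibtex(entry) for entry in queried if validate(entry)]


# Usage with a stand-in lookup table instead of a network call.
FAKE_SOURCE = {
    "10.1038/nature14539": {
        "type": "article",
        "title": "Deep learning",
        "author": "LeCun, Yann and Bengio, Yoshua and Hinton, Geoffrey",
        "year": 2015,
    }
}
records = run_pipeline(
    "# demo\n\n10.1038/nature14539\n\nunresolvable text\n", FAKE_SOURCE.get
)
print("\n\n".join(records))
```

Passing the lookup in as a callable keeps the sketch testable offline; entries that fail validation are dropped here, which may differ from how the real tool reports them.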


## 🌐 Data Sources

<div align="center">

[![CrossRef](https://img.shields.io/badge/CrossRef-B31B1B?style=for-the-badge&logo=crossref&logoColor=white)](https://www.crossref.org/)
[![Semantic Scholar](https://img.shields.io/badge/Semantic-1857B6?style=for-the-badge&logo=semanticscholar&logoColor=white)](https://www.semanticscholar.org/)
[![PubMed](https://img.shields.io/badge/PubMed-326599?style=for-the-badge&logo=pubmed&logoColor=white)](https://pubmed.ncbi.nlm.nih.gov/)
[![arXiv](https://img.shields.io/badge/𝒳_arXiv-B31B1B?style=for-the-badge)](https://arxiv.org/)
[![DataCite](https://img.shields.io/badge/DataCite-00B4A0?style=for-the-badge&logo=datacite&logoColor=white)](https://datacite.org/)
[![Zenodo](https://img.shields.io/badge/Zenodo-0A0E4A?style=for-the-badge&logo=zenodo&logoColor=white)](https://zenodo.org/)
[![Google Books](https://img.shields.io/badge/Google-4285F4?style=for-the-badge&logo=google&logoColor=white)](https://books.google.com/)
</div>


## Quick Start

Install and try OneCite in a few steps.

### 1. Installation
```bash
# Recommended: Install from PyPI
pip install onecite
```

### 2. Create an Input File
Create a file named `references.txt` with your mixed-format references:
```text
# references.txt
# Add blank lines between entries to avoid misidentification

10.1038/nature14539

Attention is all you need, Vaswani et al., NIPS 2017

Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.

https://github.com/tensorflow/tensorflow

10.5281/zenodo.3233118

arXiv:2103.00020

Smith, J. (2020). Neural Architecture Search. PhD Thesis. Stanford University.
```
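The entries above mix DOIs, an arXiv ID, a GitHub URL, and free-text citations. A minimal sketch of how such lines could be told apart is shown below; the regular expressions and the `classify` helper are illustrative assumptions, not the patterns OneCite actually uses.

```python
import re

# Illustrative patterns (assumptions), checked in order.
PATTERNS = [
    ("arxiv", re.compile(r"^(?:arxiv:)?(\d{4}\.\d{4,5})(?:v\d+)?$", re.IGNORECASE)),
    ("doi", re.compile(r"^(?:https?://(?:dx\.)?doi\.org/)?(10\.\d{4,9}/\S+)$", re.IGNORECASE)),
    ("github", re.compile(r"^https?://github\.com/([\w.-]+/[\w.-]+)/?$", re.IGNORECASE)),
]


def classify(entry: str) -> tuple[str, str]:
    """Return (kind, normalized identifier); unknown input falls back to 'text'."""
    text = entry.strip()
    for kind, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return kind, match.group(1)
    return "text", text


for line in [
    "10.1038/nature14539",
    "arXiv:2103.00020",
    "https://github.com/tensorflow/tensorflow",
    "Attention is all you need, Vaswani et al., NIPS 2017",
]:
    print(classify(line))
```

Anything that matches no pattern is treated as a free-text query, which is where fuzzy matching against the data sources would take over.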

### 3. Run OneCite
Process the file and write the resolved entries to `results.bib`:
```bash
onecite process references.txt -o results.bib --quiet
```
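For scripted workflows, the same command can be launched from Python. The sketch below only uses the flags shown above (`process`, `-o`, `--quiet`); the helper names `build_command` and `run_onecite` are hypothetical, and it assumes the `onecite` executable is on `PATH` and exits with a non-zero status on failure, which this README does not state.

```python
import subprocess
from pathlib import Path


def build_command(input_path: str, output_path: str, quiet: bool = True) -> list[str]:
    """Assemble the argument list for `onecite process`."""
    cmd = ["onecite", "process", input_path, "-o", output_path]
    if quiet:
        cmd.append("--quiet")
    return cmd


def run_onecite(input_path: str, output_path: str) -> Path:
    """Run the CLI; raises CalledProcessError if the exit status is non-zero."""
    subprocess.run(build_command(input_path, output_path), check=True)
    return Path(output_path)


print(build_command("references.txt", "results.bib"))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.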

### 4. View Output
`results.bib` now contains one entry per reference, each with an entry type matched to its source (for example `@article` or `@inproceedings`).

<details>
<summary><strong>View Complete Output (results.bib)</strong></summary>

```bibtex
@article{LeCun2015Deep,
  doi = "10.1038/nature14539",
  title = "Deep learning",
  author = "LeCun, Yann and Bengio, Yoshua and Hinton, Geoffrey",
  journal = "Nature",
  year = 2015,
  volume = 521,
  number = 7553,
  pages = "436-444",
  publisher = "Springer Science and Business Media LLC",
  url = "https://doi.org/10.1038/nature14539",
  type = "journal-article",
  abstract = "Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction...",
}
@inproceedings{Vaswani2017Attention,
  arxiv = "1706.03762",
  title = "Attention Is All You Need",
  author = "Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia",
  year = 2017,
  booktitle = "Advances in Neural Information Processing Systems (NeurIPS)",
  url = "https://arxiv.org/abs/1706.03762",
}
# ... and 5 more entries ...
```
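A quick way to sanity-check a generated file is to count entries by type and look for repeated citation keys. The sketch below does this with a regular expression from the standard library; `summarize_bib` and `find_duplicate_keys` are hypothetical helpers, and a full parser such as `bibtexparser` (already a OneCite dependency) would be more robust for anything beyond a quick check.

```python
import re
from collections import Counter

# Matches entry headers such as "@article{LeCun2015Deep,".
ENTRY_HEADER = re.compile(r"^@(\w+)\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)


def summarize_bib(bib_text: str) -> Counter:
    """Count entries per BibTeX type, case-insensitively."""
    return Counter(etype.lower() for etype, _key in ENTRY_HEADER.findall(bib_text))


def find_duplicate_keys(bib_text: str) -> list[str]:
    """Return citation keys that appear more than once."""
    keys = [key for _etype, key in ENTRY_HEADER.findall(bib_text)]
    return sorted(key for key, count in Counter(keys).items() if count > 1)


sample = (
    '@article{LeCun2015Deep,\n  title = "Deep learning",\n}\n'
    '@inproceedings{Vaswani2017Attention,\n  year = 2017,\n}\n'
)
print(summarize_bib(sample))
print(find_duplicate_keys(sample))
```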

</details>

## 📖 Advanced Usage

<details>
<summary><strong>Direct String and Stdin Input</strong></summary>

```bash
onecite process "10.1038/nature14539"
onecite process "Attention is all you need, Vaswani et al., NIPS 2017"
echo "10.1038/nature14539" | onecite process -
```
</details>

<details>
<summary><strong>Interactive Disambiguation</strong></summary>

For ambiguous entries, use the `--interactive` flag to choose the correct match yourself.

**Command**:
```bash
onecite process ambiguous.txt --interactive
```

**Example Interaction**:
```
Found multiple possible matches for "Deep learning Hinton":
1. Deep learning
   Authors: LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey
   Journal: Nature, 2015
   DOI: 10.1038/nature14539

2. Deep belief networks
   Authors: Hinton, Geoffrey E.
   Journal: Scholarpedia, 2009
   DOI: 10.4249/scholarpedia.5947

Please select (1-2, 0=skip): 1
Selected: Deep learning
```
</details>

<details>
<summary><strong>🐍 Use as a Python Library</strong></summary>

Use OneCite directly in your Python scripts.

```python
from onecite import process_references

# A callback can be used for non-interactive selection (e.g., always choose the best match)
def auto_select_callback(candidates):
    return 0  # Index of the best candidate

result = process_references(
    input_content="Deep learning review\nLeCun, Bengio, Hinton\nNature 2015",
    input_type="txt",
    template_name="journal_article_full",
    output_format="bibtex",
    interactive_callback=auto_select_callback
)

print('\n\n'.join(result['results']))
```
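Instead of always returning index 0, a callback could pick the candidate whose title is closest to the query. The shape of `candidates` is not documented here, so this sketch assumes a non-empty list of dicts with a `title` key; `make_title_callback` is a hypothetical helper, and `difflib` from the standard library stands in for whatever matching OneCite applies internally.

```python
from difflib import SequenceMatcher


def make_title_callback(query_title: str):
    """Build a callback that returns the index of the closest-titled candidate."""
    wanted = query_title.lower()

    def callback(candidates):
        # Assumption: each candidate is a dict with an optional "title" key.
        def score(index):
            title = candidates[index].get("title", "").lower()
            return SequenceMatcher(None, wanted, title).ratio()

        return max(range(len(candidates)), key=score)

    return callback


# Usage with hand-written candidates, ordered so the best match is not first.
demo_candidates = [{"title": "Deep belief networks"}, {"title": "Deep learning"}]
pick = make_title_callback("Deep learning")
print(pick(demo_candidates))
```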
</details>



## 🗺️ Roadmap

- **OneCite Skill** — Skill package for AI coding agents (Claude Code, Windsurf, etc.).

## 🤝 Contributing

Contributions are always welcome! Please see [**CONTRIBUTING.md**](CONTRIBUTING.md) for development guidelines and instructions on how to submit a pull request.

## 📄 License

This project is licensed under the **MIT License**. See the [**LICENSE**](LICENSE) file for details.

### Disclosure

Generative AI tools assisted with implementation details during development. The maintainer reviewed and integrated all output, and **no LLMs are used by the package at runtime**.

---

<div align="center">

**OneCite**

<p>
  <a href="https://github.com/HzaCode/OneCite">Star on GitHub</a> •
  <a href="http://hezhiang.com/onecite">Web App</a> •
  <a href="https://github.com/HzaCode/OneCite/issues">🐛 Report an Issue</a> •
  <a href="https://github.com/HzaCode/OneCite/discussions">Discussions</a>
</p>

</div>
