pybliometrics.scopus.CitationOverview

CitationOverview() implements the Citation Overview API. Your API key requires manual approval by Elsevier; contact Scopus to request it. Without approval, each request raises a 403 error. Ideally, provide the key via apikey="XXX" when initializing the class, which overrides the keys provided in the configuration file.

Documentation

class pybliometrics.scopus.CitationOverview(identifier, start, end=2023, id_type='scopus_id', eid=None, refresh=False, citation=None, **kwds)[source]

Interaction with the Citation Overview API.

Parameters:
  • identifier (List[Union[str, int]]) – Up to 25 identifiers for which to look up citations. Must be Scopus IDs, DOIs, PIIs or Pubmed IDs.

  • start (Union[int, str]) – The first year for which the citation count should be loaded.

  • end (Union[int, str], optional) – The last year for which the citation count should be loaded. Defaults to the current year.

    Default: 2023

  • id_type (str, optional) – The type of the IDs provided in identifier. Must be one of "scopus_id", "doi", "pii", "pubmed_id".

    Default: 'scopus_id'

  • eid (str, optional) – (deprecated) The Scopus ID of the abstract. This parameter will be removed in a future release; use scopus_id instead, after stripping the part up to the second hyphen. If you use this parameter, its value will be converted to scopus_id.

    Default: None

  • refresh (Union[bool, int], optional) – Whether to refresh the cached file if it exists or not. If int is passed, cached file will be refreshed if the number of days since last modification exceeds that value.

    Default: False

  • citation (Optional[str], optional) – Allows for the exclusion of self-citations or those by books. If None, will count all citations. Allowed values: None, exclude-self, exclude-books

    Default: None

  • kwds (str) – Keywords passed on as query parameters. Must contain fields and values mentioned in the API specification at https://dev.elsevier.com/documentation/AbstractCitationAPI.wadl.

Raises:
  • ValueError – If parameter identifier contains fewer than 1 or more than 25 elements.

  • ValueError – If any of the parameters citation, id_type or refresh is not one of the allowed values.
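The two identifier constraints above (the deprecated eid form and the limit of 25 identifiers) can be handled before the class is called. The following is a minimal sketch, not part of pybliometrics: the helper names eid_to_scopus_id and batches are hypothetical, and the sample EIDs assume the common "2-s2.0-" prefix.

```python
from typing import Iterator, List, Sequence, Union

MAX_IDS = 25  # documented upper limit of identifiers per CitationOverview call


def eid_to_scopus_id(eid: str) -> str:
    """Strip everything up to and including the second hyphen (hypothetical helper)."""
    return eid.split("-", 2)[-1]


def batches(ids: Sequence[Union[str, int]],
            size: int = MAX_IDS) -> Iterator[List[Union[str, int]]]:
    """Yield consecutive slices holding at most `size` identifiers (hypothetical helper)."""
    for i in range(0, len(ids), size):
        yield list(ids[i:i + size])


# Illustrative EIDs, assuming the "2-s2.0-" prefix
eids = ["2-s2.0-85068268027", "2-s2.0-84930616647"]
scopus_ids = [eid_to_scopus_id(e) for e in eids]
print(scopus_ids)  # ['85068268027', '84930616647']

# 60 placeholder identifiers split into requests of at most 25 each
batch_sizes = [len(b) for b in batches(list(range(60)))]
print(batch_sizes)  # [25, 25, 10]
```

Each batch could then be passed as identifier to a separate CitationOverview() call.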

Notes

The directory for cached results is {path}/STANDARD/{id}-{citation}, where path is specified in your configuration file, and id the md5-hashed version of a string joining identifier on underscore.
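As a rough illustration of the naming scheme described above, the cache location can be reconstructed with the standard library. This is a sketch under assumptions, not the library's actual code: the function name cache_file, the UTF-8 encoding, the rendering of citation=None, and the path /tmp/cache are all hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Optional, Sequence, Union


def cache_file(path: str,
               identifier: Sequence[Union[str, int]],
               citation: Optional[str] = None) -> Path:
    """Build {path}/STANDARD/{id}-{citation} (hypothetical helper)."""
    # Join the identifiers on underscore, as described in the note
    joined = "_".join(str(i) for i in identifier)
    # Assumption: the string is UTF-8 encoded before md5 hashing
    hashed = hashlib.md5(joined.encode("utf8")).hexdigest()
    # Assumption: citation=None is rendered via its string form
    return Path(path) / "STANDARD" / f"{hashed}-{citation}"


p = cache_file("/tmp/cache", ["85068268027", "84930616647"], "exclude-self")
print(p.parent.name)  # STANDARD
print(p.name.endswith("-exclude-self"))  # True
```

Because the hash is computed from the joined string, the same identifiers in a different order map to a different cache entry.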

Your API Key needs to be augmented by Elsevier’s Scopus Integration Team to access this API.

property authors: List[NamedTuple | None] | None

A list of lists of namedtuples storing author information, where each namedtuple corresponds to one author and each sub-list to one document. The information in each namedtuple is (name surname initials id url). All entries are strings.

property cc: List[List[Tuple[int, int]]]

List of lists of tuples of yearly number of citations for specified years, where each sub-list corresponds to one document.

property citationType_long: List[str] | None

Type (long version) of the documents (e.g. article, review).

property citationType_short: List[str] | None

Type (short version) of the documents (e.g. ar, re).

property columnTotal: int

The yearly number of citations for all documents combined.

property doi: List[str] | None

Document Object Identifier (DOI) of the documents.

property endingPage: List[str] | None

Ending pages of the documents.

property grandTotal: int

The total number of citations of all documents together.

property h_index: int

Combined h-index of citations of all the documents.
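For orientation, the sketch below computes an h-index from per-document citation totals, assuming the standard definition (the largest h such that h documents have at least h citations each). How the API derives its value is not specified here, and the function name h_index_of is hypothetical.

```python
from typing import Iterable


def h_index_of(citations: Iterable[int]) -> int:
    """Largest h such that h documents have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count < rank:
            break
        h = rank
    return h


# Per-document totals taken from the rowTotal example in this documentation
print(h_index_of([16, 13]))  # 2
```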

property issn: List[str | Tuple[str, str] | None] | None

ISSN of the publishers of the documents. Note: If E-ISSN is known to Scopus, this returns both ISSN and E-ISSN in random order separated by blank space.

property issueIdentifier: List[str | None] | None

Issue numbers of the documents.

property laterColumnTotal: int

The total number of citations for all years after the end year for all documents combined.

property lcc: List[int]

Number of citations after the end year of each document.

property pcc: int

Number of citations before the start year.

property pii: List[str | None] | None

The Publication Item Identifier (PII) of the documents.

property prevColumnTotal: int

The total number of citations for all years before the start year for all documents combined.

property rangeColumnTotal: int

The total number of citations for all specified years for all documents combined.

property rangeCount: List[int]

Total citation count over the specified year range for each document.

property rowTotal: List[int]

Total number of citations (specified and omitted years) for each document.

property scopus_id: List[int]

The Scopus ID(s) of the documents. Might differ from the ones provided.

property sortTitle: List[str | None] | None

Name of source the documents are published in (e.g. the Journal).

property startingPage: List[str | None] | None

Starting page.

property title: List[str]

Titles of each document.

property url: List[str]

URL(s) to Citation Overview API view of each document.

property volume: str | None

Volume for the abstract.

get_cache_file_age()

Return the age of the cached file in days.

Return type:

int

get_cache_file_mdate()

Return the modification date of the cached file.

Return type:

str

get_key_remaining_quota()

Return the number of remaining requests for the current key and the current API (relative to the last actual request).

Return type:

str | None

get_key_reset_time()

Return the time when the current key is reset (relative to the last actual request).

Return type:

str | None

Examples

The class can download yearly citation counts for up to 25 documents at once. Simply provide a list of Scopus identifiers, DOIs, PIIs or PubMed IDs, and specify the identifier type in id_type. The API needs to know for which years you want to retrieve yearly citation counts. You therefore need to set the first year for which CitationOverview() should return yearly citation counts (e.g., the publication year). If no ending year is given, CitationOverview() uses the current year. Optionally, you can exclude self-citations or citations by books via the citation parameter.

You initialize the class with a list of identifiers:

>>> from pybliometrics.scopus import CitationOverview
>>> identifier = ["85068268027", "84930616647"]
>>> co = CitationOverview(identifier, start=2019, end=2021)

You can obtain basic information just by printing the object:

>>> print(co)
2 document(s) has/have the following total citation count
as of 2021-07-17:
    16; 13

The key attribute is cc, which provides a list of lists of tuples storing year-wise citations to each document. Each sub-list corresponds to one document, in the order specified when initializing the class:

>>> co.cc
[[(2019, 0), (2020, 6), (2021, 10)],
 [(2019, 2), (2020, 2), (2021, 1)]]

The attributes pcc, rangeCount, lcc and rowTotal provide citation summaries for each document. pcc is the count of citations before the start year, rangeCount the count of citations for the specified years, and lcc the count of citations after the end year. For the sum (i.e., the total number of citations by document) use rowTotal:

>>> co.pcc
[0, 8]
>>> co.rangeCount
[16, 5]
>>> co.lcc
[0, 0]
>>> co.rowTotal
[16, 13]
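The relationship between these four summaries can be checked by hand. The sketch below copies the values shown above into plain lists (no API call is made; the variable names are illustrative) and recomputes rangeCount and rowTotal from the other attributes.

```python
# Values copied from the example output above
cc = [[(2019, 0), (2020, 6), (2021, 10)],
      [(2019, 2), (2020, 2), (2021, 1)]]
pcc = [0, 8]  # citations before the start year
lcc = [0, 0]  # citations after the end year

# rangeCount: sum of the yearly counts of each document
range_count = [sum(n for _, n in doc) for doc in cc]
print(range_count)  # [16, 5]

# rowTotal: citations before, within and after the specified range
row_total = [p + r + l for p, r, l in zip(pcc, range_count, lcc)]
print(row_total)  # [16, 13]
```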

The columnTotal attribute holds the yearly number of citations for all documents combined, and rangeColumnTotal is the sum of these yearly values. Finally, grandTotal is the total number of citations for all documents combined.

>>> co.columnTotal
[2, 8, 11]
>>> co.rangeColumnTotal
21
>>> co.grandTotal
29
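The column-wise totals follow from the same data. The sketch below uses values copied from the examples above rather than a live request, and the variable names are illustrative. Note that grandTotal also includes citations outside the specified years, which is why it exceeds rangeColumnTotal here.

```python
# Values copied from the example output above
cc = [[(2019, 0), (2020, 6), (2021, 10)],
      [(2019, 2), (2020, 2), (2021, 1)]]
row_total = [16, 13]

# Drop the years, then sum the counts of all documents year by year
per_doc = [[n for _, n in doc] for doc in cc]
column_total = [sum(year) for year in zip(*per_doc)]
print(column_total)  # [2, 8, 11]

range_column_total = sum(column_total)
print(range_column_total)  # 21

grand_total = sum(row_total)
print(grand_total)  # 29
```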

With the citation parameter, you can exclude self-citations or citations from books:

>>> co_self = CitationOverview(identifier, start=2019, end=2021,
                               citation="exclude-self")
>>> print(co_self)
2 document(s) has/have the following total citation count
excluding self-citations as of 2021-07-17:
    14; 11
>>> co_books = CitationOverview(identifier, start=2019, end=2021,
                                citation="exclude-books")
>>> print(co_books)
2 document(s) has/have the following total citation count
excluding citations from books as of 2021-07-17:
    16; 13

Author information is also stored as lists of namedtuples:

>>> co.authors[0]
[Author(name='Rose M.E.', surname='Rose', initials='M.E.', id='57209617104',
        url='https://api.elsevier.com/content/author/author_id/57209617104'),
 Author(name='Kitchin J.R.', surname='Kitchin', initials='J.R.', id='7004212771',
        url='https://api.elsevier.com/content/author/author_id/7004212771')]
>>> co.authors[1]
[Author(name='Kitchin J.R.', surname='Kitchin', initials='J.R.', id='7004212771',
        url='https://api.elsevier.com/content/author/author_id/7004212771')]

For instance, you can pass co.authors[0][0].id to the AuthorRetrieval() class to obtain further author information.
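To look up every author only once, it can help to collect the distinct author IDs across all documents first. The sketch below rebuilds the namedtuples shown above as a local stand-in for co.authors, so it runs without an API key. The Author type defined here only mirrors the documented fields and is not imported from the library.

```python
from collections import namedtuple

# Local stand-in mirroring the documented fields (name surname initials id url)
Author = namedtuple("Author", "name surname initials id url")
base = "https://api.elsevier.com/content/author/author_id/"
rose = Author("Rose M.E.", "Rose", "M.E.", "57209617104", base + "57209617104")
kitchin = Author("Kitchin J.R.", "Kitchin", "J.R.", "7004212771",
                 base + "7004212771")
authors = [[rose, kitchin], [kitchin]]  # stand-in for co.authors

# Skip documents without author information, then deduplicate the IDs
unique_ids = sorted({a.id for doc in authors if doc for a in doc})
print(unique_ids)  # ['57209617104', '7004212771']
```

Each entry of unique_ids could then be handed to AuthorRetrieval() in turn.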

Finally, there is bibliographic information, too:

>>> co.title
['pybliometrics: Scriptable bibliometrics using a Python interface to Scopus',
 'Examples of effective data sharing in scientific publishing']
>>> co.sortTitle
['SoftwareX', 'ACS Catalysis']
>>> co.volume
['10', '5']
>>> co.issueIdentifier
[None, '6']
>>> co.citationType_long
['Article', 'Review']

Using pandas, you can convert the citation counts into a DataFrame as follows:

>>> import pandas as pd
>>> df = pd.concat([pd.Series(dict(x)) for x in co.cc], axis=1).T
>>> df.index = co.scopus_id
>>> print(df)
             2019  2020  2021
85068268027     0     6    10
84930616647     2     2     1

Downloaded results are cached to expedite subsequent analyses. This information may become outdated. To refresh the cached results if they exist, set refresh=True, or provide an integer that will be interpreted as the maximum allowed number of days since the last modification date. For example, if you want to refresh all cached results older than 100 days, set refresh=100. Use co.get_cache_file_mdate() to obtain the date of last modification, and co.get_cache_file_age() to determine the number of days since the last modification.
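The refresh rule described above can be expressed as a small decision function. This is a sketch of the documented behaviour, not the library's implementation. The name needs_refresh is hypothetical, and age_days stands for the value that get_cache_file_age() would return.

```python
from typing import Union


def needs_refresh(refresh: Union[bool, int], age_days: int) -> bool:
    """Decide whether a cached file should be downloaded again."""
    # bool is a subclass of int in Python, so test for it first
    if isinstance(refresh, bool):
        return refresh
    # An integer is the maximum allowed age in days
    return age_days > refresh


print(needs_refresh(True, 0))      # True
print(needs_refresh(False, 1000))  # False
print(needs_refresh(100, 101))     # True
print(needs_refresh(100, 100))     # False
```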