pybliometrics.scopus.ScopusSearch

class pybliometrics.scopus.ScopusSearch(query, refresh=False, subscriber=True, view=None, download=True, integrity_fields=None, integrity_action='raise', verbose=False, **kwds)

Interaction with the Scopus Search API.

Parameters:
  • query (str) – A string of the query.
  • refresh (bool or int (optional, default=False)) – Whether to refresh the cached file if it exists. If an int is passed, the cached file is refreshed when the number of days since its last modification exceeds that value.
  • subscriber (bool (optional, default=True)) – Whether the user accesses Scopus with a subscription or not. For subscribers, Scopus’s cursor navigation will be used. Sets the number of entries in each query iteration to the maximum number allowed by the corresponding view.
  • view (str (optional, default=None)) – Which view to use for the query, see https://dev.elsevier.com/sc_search_views.html. Allowed values: STANDARD, COMPLETE. If None, defaults to COMPLETE if subscriber=True and to STANDARD if subscriber=False.
  • download (bool (optional, default=True)) – Whether to download results (if they have not been cached).
  • integrity_fields (None or iterable (default=None)) – Iterable of field names whose completeness should be checked. ScopusSearch will perform the action specified in integrity_action if elements in these fields are missing. This helps to catch idiosyncratically missing elements that should always be present, such as the EID or the source ID.
  • integrity_action (str (optional, default="raise")) – What to do if the integrity of the provided fields cannot be verified. Possible actions: “raise” (raise an AttributeError) or “warn” (issue a UserWarning).
  • verbose (bool (optional, default=False)) – Whether to print a downloading progress bar to terminal. Has no effect for download=False or when query file is in cache.
  • kwds (key-value pairings, optional) – Keywords passed on as query parameters. Must contain fields and values listed in the API specification (https://dev.elsevier.com/documentation/SCOPUSSearchAPI.wadl), such as “field” or “date”.
Raises:
  • ScopusQueryError – For non-subscribers, if the number of search results exceeds 5000.
  • ValueError – If the view or the integrity_action parameter is not one of the allowed ones.
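
The rules documented above for view and refresh can be restated as a small sketch. This is only an illustration of the documented behavior, not the library's internal code; the helper names resolve_view and should_refresh are hypothetical.

```python
ALLOWED_VIEWS = ("STANDARD", "COMPLETE")


def resolve_view(view=None, subscriber=True):
    """Return the effective view following the documented defaults.

    Hypothetical helper: mirrors the parameter description, not the
    library's actual implementation.
    """
    if view is None:
        # Documented default: COMPLETE for subscribers, STANDARD otherwise.
        return "COMPLETE" if subscriber else "STANDARD"
    if view not in ALLOWED_VIEWS:
        raise ValueError(f"view must be one of {ALLOWED_VIEWS}, got {view!r}")
    return view


def should_refresh(refresh, cache_age_days):
    """Decide whether a cached file should be refreshed.

    `refresh` is either a bool or a number of days; `cache_age_days` is the
    age of the cached file in days.
    """
    # bool is a subclass of int in Python, so it must be checked first.
    if isinstance(refresh, bool):
        return refresh
    # An int means: refresh only if the file is older than that many days.
    return cache_age_days > refresh
```

For instance, should_refresh(30, 45) is True, while should_refresh(30, 10) and should_refresh(False, 45) are both False.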

Examples

See https://pybliometrics.readthedocs.io/en/stable/examples/ScopusSearch.html.

Notes

The directory for cached results is {path}/{view}/{fname}, where path is specified in ~/.scopus/config.ini and fname is the md5-hashed version of query.
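
A minimal sketch of how such a cache location could be computed, assuming the query is UTF-8 encoded before hashing and the hexadecimal digest is used as the file name (both are assumptions; the note above only states that fname is the md5 hash of the query). The function name cache_file_path and the example base path are hypothetical.

```python
import hashlib
from pathlib import Path


def cache_file_path(base_path, view, query):
    """Build {path}/{view}/{fname} for a query.

    Assumption: fname is the hex digest of the UTF-8 encoded query string.
    """
    fname = hashlib.md5(query.encode("utf8")).hexdigest()
    return Path(base_path) / view / fname


# Hypothetical base path; the real one comes from the configuration file.
example = cache_file_path("/tmp/scopus_search", "COMPLETE", "AU-ID(123)")
```

Because the file name depends only on the query string, two searches with identical queries and views would map to the same cached file under this scheme.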

results

A list of namedtuples in the form (eid doi pii pubmed_id title subtype subtypeDescription creator afid affilname affiliation_city affiliation_country author_count author_names author_ids author_afids coverDate coverDisplayDate publicationName issn source_id eIssn aggregationType volume issueIdentifier article_number pageRange description authkeywords citedby_count openaccess fund_acr fund_no fund_sponsor). Field definitions correspond to https://dev.elsevier.com/guides/ScopusSearchViews.htm and the values are returned as-is, except for afid, affilname, affiliation_city, affiliation_country, author_names, author_ids and author_afids: this information is joined on “;”. If an author has multiple affiliations, they are joined on “-” (e.g. Author1Aff;Author2Aff1-Author2Aff2).

Raises: ValueError – If the elements provided in integrity_fields do not match the actual field names (listed above).
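
The integrity check described by integrity_fields and integrity_action can be sketched as follows. This is an illustrative reimplementation under the assumption that a missing element is represented as None; check_integrity, the Doc namedtuple and the sample values are hypothetical and not part of the library.

```python
import warnings
from collections import namedtuple


def check_integrity(results, integrity_fields, integrity_action="raise"):
    """Verify that the given fields are populated in every result."""
    if integrity_action not in ("raise", "warn"):
        raise ValueError("integrity_action must be 'raise' or 'warn'")
    if not integrity_fields or not results:
        return
    # Field names must match the namedtuple's actual fields.
    known = results[0]._fields
    unknown = [f for f in integrity_fields if f not in known]
    if unknown:
        raise ValueError(f"Unknown field(s): {', '.join(unknown)}")
    for field in integrity_fields:
        # Assumption: a missing element is stored as None.
        n_missing = sum(1 for r in results if getattr(r, field) is None)
        if n_missing:
            msg = f"{n_missing} element(s) missing in field '{field}'"
            if integrity_action == "raise":
                raise AttributeError(msg)
            warnings.warn(msg, UserWarning)


# Made-up sample data with a reduced set of fields.
Doc = namedtuple("Doc", "eid source_id title")
docs = [Doc("2-s2.0-1", "123", "First"), Doc(None, "456", "Second")]
```

With this sample, check_integrity(docs, ["source_id"]) passes silently, whereas check_integrity(docs, ["eid"]) raises an AttributeError because one EID is missing.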

Notes

The list of authors and the list of affiliations per author are deduplicated.
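
Since the author and affiliation fields arrive as joined strings, they usually need to be split again before analysis. The following sketch shows one way to do that, based on the “;” and “-” separators described above; the helper names and the sample identifiers are made up for illustration.

```python
def split_joined(value):
    """Split a ';'-joined field (e.g. author_names) into a list."""
    if not value:
        # Field may be None or empty when the information is unavailable.
        return []
    return value.split(";")


def split_author_afids(value):
    """Split author_afids into one list of affiliation IDs per author.

    Authors are separated by ';', multiple affiliations of the same
    author by '-'.
    """
    if not value:
        return []
    return [entry.split("-") for entry in value.split(";")]


# Made-up affiliation IDs: the second author has two affiliations.
per_author = split_author_afids("60000001;60000002-60000003")
```

Here per_author holds one inner list per author, so the second entry contains both affiliation IDs of the second author.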

get_eids()

EIDs of retrieved documents.

get_cache_file_age()

Return the age of the cached file in days.

get_cache_file_mdate()

Return the modification date of the cached file.

get_key_remaining_quota()

Return the number of remaining requests for the current key and the current API (relative to the last actual request).

get_key_reset_time()

Return the time when the current key is reset (relative to the last actual request).

get_results_size()

Return the number of results (works even if download=False).
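
Because get_results_size() works with download=False, the size of a result set can be checked before committing to a full download. The sketch below uses only the constructor arguments, the results attribute and the method documented on this page; the wrapper fetch_if_small and its max_results threshold are hypothetical, and running it requires an installed and configured pybliometrics with a valid API key.

```python
def fetch_if_small(query, max_results=5000):
    """Download results only if the query returns few enough documents.

    Returns a tuple (number_of_results, results_or_None).
    """
    # Imported lazily so that defining this function does not require
    # pybliometrics to be installed.
    from pybliometrics.scopus import ScopusSearch

    # First pass: ask only for the number of matching documents.
    probe = ScopusSearch(query, download=False)
    n_results = probe.get_results_size()
    if n_results > max_results:
        return n_results, None

    # Second pass: actually download (or read from cache) the results.
    search = ScopusSearch(query)
    return n_results, search.results
```

The default threshold of 5000 echoes the limit mentioned for non-subscribers; subscribers may want to pick a different value.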