How to access Scopus
To access Scopus via its API, you need two things. First, your institution must subscribe not only to Scopus but also to its API; second, you need to register API keys at https://dev.elsevier.com/apikey/manage. You may register up to 10 keys per profile.
The Scopus API recognizes you as a member of your institution via IP range. For working from home, Scopus can also grant InstTokens. Thus one of three things needs to happen:
1. You are in your institution’s network
2. You use your institution’s VPN
3. You use an InstToken
Option 1 is easy and the most common.
Option 2 might require you to additionally set a proxy. You can do so in the configuration file.
Option 3 is rare. An InstToken is provided directly by Scopus/Elsevier to allow remote access in the absence of a VPN. It is coupled directly to a key. If you have an InstToken, provide it during setup when pybliometrics prompts you for it. Alternatively, add it to the configuration file manually. You may also set the InstToken via insttoken="XYZ" in any class. This is the preferred solution if you possess multiple keys.
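For options 2 and 3, the relevant configuration entries can be sketched with Python's standard configparser module. The section and key names below (an Authentication section with APIKey and InstToken, a Proxy section with http and https) are assumptions about the file layout, and all values are placeholders; compare them with the configuration file that pybliometrics created for you.

```python
import configparser

# Placeholder values; the section and key names are assumptions and
# should be compared with your own pybliometrics configuration file.
SAMPLE_CONFIG = """
[Authentication]
APIKey = 0123456789abcdef
InstToken = XYZ

[Proxy]
http = socks5://127.0.0.1:1234
https = socks5://127.0.0.1:1234
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE_CONFIG)

# Option 3: the InstToken stored next to the key it is coupled to.
insttoken = config.get("Authentication", "InstToken", fallback=None)

# Option 2: proxy addresses, one entry per protocol.
proxies = dict(config["Proxy"]) if config.has_section("Proxy") else {}

print(insttoken)
print(proxies)
```

Reading the file this way only checks that the entries are well-formed; it does not verify that the key, token, or proxy are accepted by Scopus.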
There are only three Scopus APIs that you can access without an institutional subscription: the Abstract Retrieval API, the Scopus Search API and the Subject Classifications API.
As a non-subscriber, use view=META in the AbstractRetrieval() class. To search for documents via the Scopus Search API as a non-subscriber, set subscriber=False in the ScopusSearch() class (you will retrieve less information, however). The Subject Classifications API is the same for subscribers and non-subscribers.
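The non-subscriber settings above could be wired up roughly as follows. This is a sketch: the helper names and the is_subscriber flag are hypothetical, and only view="META" and subscriber=False come from the text. The pybliometrics import sits inside the last function so that the two helpers run without network access; recent pybliometrics versions may also require an initialisation call before the first request.

```python
def retrieval_kwargs(is_subscriber):
    """Keyword arguments for AbstractRetrieval() (hypothetical helper)."""
    # Non-subscribers are limited to the META view.
    return {} if is_subscriber else {"view": "META"}


def search_kwargs(is_subscriber):
    """Keyword arguments for ScopusSearch() (hypothetical helper)."""
    # subscriber=False returns less information per document.
    return {"subscriber": is_subscriber}


def fetch_document_and_search(identifier, query, is_subscriber=False):
    """Retrieve one abstract and run one search; needs a valid API key."""
    # Imported here so the helpers above work without pybliometrics.
    from pybliometrics.scopus import AbstractRetrieval, ScopusSearch

    abstract = AbstractRetrieval(identifier, **retrieval_kwargs(is_subscriber))
    search = ScopusSearch(query, **search_kwargs(is_subscriber))
    return abstract, search


print(retrieval_kwargs(is_subscriber=False))
print(search_kwargs(is_subscriber=False))
```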
API Key quotas and 429 error
Each API key has usage limits for the different Scopus APIs, which are reset weekly. See https://dev.elsevier.com/api_key_settings.html for the list; for example, a key allows for 5,000 retrieval requests, or 20,000 search requests via the Scopus Search API.
The usage limits of a key are reset one week after its first usage. Each class has two methods that help you keep track: .get_key_remaining_quota() tells you how many calls you have left with the current key for the last used API, and .get_key_reset_time() tells you the time until reset.
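A small helper can combine the two methods into one status line. This is a minimal sketch: describe_quota and FakeResult are hypothetical names, the return values in FakeResult are made up, and the assumption that the quota may be None (for example when no request was actually sent) should be checked against your pybliometrics version.

```python
def describe_quota(result):
    """Summarise the quota of the key used for `result`.

    `result` is any pybliometrics class instance; only the two methods
    named in the text are called on it.
    """
    remaining = result.get_key_remaining_quota()
    reset = result.get_key_reset_time()
    if remaining is None:
        # Assumption: no quota information is available, e.g. because
        # the result was read from a local cache.
        return "No quota information available."
    return f"{remaining} calls left; reset: {reset}"


class FakeResult:
    """Stand-in for a pybliometrics object, for illustration only."""

    def get_key_remaining_quota(self):
        return "4999"

    def get_key_reset_time(self):
        return "2030-01-01 00:00:00"


print(describe_quota(FakeResult()))
```

With a real object, for instance the result of a search, the call would simply be describe_quota(search_result).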
When one key has exceeded its quota for the given API, pybliometrics moves on to the other keys provided in the configuration file. Be sure to put all your keys in the config.ini.
When the last key has been depleted as well, pybliometrics raises a pybliometrics.scopus.exception.Scopus429Error. In this case you need to restart the application one week after it has been started.
Error message hierarchy
pybliometrics raises exceptions if the download status is not ok. To allow for error-specific handling, pybliometrics employs the following exception hierarchy:
- pybliometrics.scopus.exception.ScopusException: Base class for the following exceptions.
  - pybliometrics.scopus.exception.ScopusQueryError: When a search query returns more results than specified or allowed (Scopus allows a maximum of 5000 results). Change the query such that it returns at most 5000 results.
  - pybliometrics.scopus.exception.ScopusHtmlError: Base class for the following exceptions raised through the requests package.
    - pybliometrics.scopus.exception.Scopus400Error: BAD REQUEST: Usually an invalid search query, such as a missing parenthesis. Verify that your query works in Advanced Search.
    - pybliometrics.scopus.exception.Scopus401Error: UNAUTHORIZED: Either the provided key is not correct, in which case you should change it in your configuration file, or you are outside the network that provides you access to the Scopus database (e.g. your university network). Remember that you need both to access Scopus.
    - pybliometrics.scopus.exception.Scopus404Error: NOT FOUND: The entity you are looking for does not exist. Check that your identifier is still pointing to the item you are looking for.
    - pybliometrics.scopus.exception.Scopus413Error: The request entity is too large to be processed by the web server. Try a less complex query.
    - pybliometrics.scopus.exception.Scopus414Error: TOO LARGE: The query string you are using is too long. Break it up into smaller pieces.
    - pybliometrics.scopus.exception.Scopus429Error: QUOTA EXCEEDED: The weekly quota of your API key has been depleted. If you provided multiple keys in your configuration file, this means all your keys are depleted. In this case, wait up to a week until your API key’s quota has been reset.
    - pybliometrics.scopus.exception.ScopusServerError: General exception for all server-related errors. These may happen for various reasons (the internet is a noisy medium); usually it helps to wait a few seconds before the next query. Server errors are also raised if you use a non-existent field name in searches. Verify that your query works in Scopus’ Advanced Search. Previously pybliometrics used more fine-grained exceptions in the 5xx space, namely “Scopus500Error”, “Scopus502Error” and “Scopus504Error”. These are deprecated; use “ScopusServerError” instead.
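Because the exceptions listed above form a hierarchy, except clauses should go from the most specific class to the most general one. The sketch below uses local stand-in classes with the same names so that it runs offline; in real code the classes would be imported from pybliometrics.scopus.exception. The parent class given to ScopusServerError and the classify helper with its return strings are assumptions made for illustration.

```python
# Local stand-ins that mirror the names listed above so that the example
# runs offline. In real code, import the classes from
# pybliometrics.scopus.exception instead of defining them.
class ScopusException(Exception):
    pass


class ScopusHtmlError(ScopusException):
    pass


class Scopus404Error(ScopusHtmlError):
    pass


class Scopus429Error(ScopusHtmlError):
    pass


class ScopusServerError(ScopusHtmlError):  # assumed parent class
    pass


def classify(error):
    """Map an exception to a coarse reaction, most specific class first."""
    try:
        raise error
    except Scopus429Error:
        return "wait for quota reset"
    except Scopus404Error:
        return "check identifier"
    except ScopusServerError:
        return "retry later"
    except ScopusException:
        # Catches every remaining exception in the hierarchy.
        return "inspect query"


print(classify(Scopus429Error()))
print(classify(ScopusHtmlError()))
```

Placing the ScopusException clause first would swallow all the specific cases, which is why it comes last.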
If queries break for other reasons, exceptions from requests.exceptions are raised, such as:
- requests.exceptions.TooManyRedirects: Exceeded 30 redirects. The entity you are looking for was not properly merged with another entity, in the sense that it is not forwarding correctly. This happens rarely when Scopus Author profiles are merged, and even less often with Abstract EIDs and Affiliation IDs.
pybliometrics will automatically try to re-establish the connection a few times on typical server-side errors. The number of retries is specified in your configuration file, in the section “Requests” under the value “Retries” (if none is given, pybliometrics makes 5 attempts).
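To illustrate how such a setting with a default behaves, the following sketch reads the “Retries” value with Python's standard configparser and falls back to 5 when it is absent. It is not how pybliometrics reads its configuration internally, and the file content shown is a placeholder.

```python
import configparser

config = configparser.ConfigParser()
# Placeholder content; a real configuration file holds more sections.
config.read_string("[Requests]\nRetries = 3\n")

# Fall back to 5 attempts when the value is missing, as described above.
retries = config.getint("Requests", "Retries", fallback=5)

# A configuration without the section yields the fallback value.
empty = configparser.ConfigParser()
default_retries = empty.getint("Requests", "Retries", fallback=5)

print(retries, default_retries)
```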