pybliometrics.sciencedirect.ObjectMetadata
All components of a document that are not text (figures, formulas, etc.) are called objects. ObjectMetadata() retrieves the metadata of all objects in a document from the ScienceDirect Object Retrieval API.
Documentation
- class pybliometrics.sciencedirect.ObjectMetadata(identifier, view='META', id_type=None, refresh=False, **kwds)
Class to retrieve the metadata of all objects of a document.
- Parameters:
  - identifier (int|str) – The identifier of an article.
  - view (str, optional) – The view of the object. Allowed value: META. Default: 'META'
  - id_type (str|None, optional) – The type of identifier supplied. Allowed values: doi, pii, scopus_id, pubmed_id, eid. Default: None
  - refresh (bool|int, optional) – Whether to refresh the cached file if it exists. Default: False
  - kwds (str)
- property results: list[Metadata]
Metadata of the objects in a document. A list of namedtuples with the fields eid, filename, height, mimetype, ref, size, type, url and width.
- get_cache_file_age()
Return the age of the cached file in days.
- Return type:
int
- get_cache_file_mdate()
Return the modification date of the cached file.
- Return type:
str
- get_key_remaining_quota()
Return the number of remaining requests for the current key and the current API (relative to the last actual request).
- Return type:
str | None
- get_key_reset_time()
Return the time when the current key is reset (relative to the last actual request).
- Return type:
str | None
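The cache methods above can drive a simple refresh decision. The following is a minimal sketch, not part of pybliometrics: `is_stale` and `CachedStub` are hypothetical names, and the stub only mimics get_cache_file_age() so that the example runs without an API key.

```python
class CachedStub:
    """Hypothetical stand-in that mimics ObjectMetadata.get_cache_file_age()."""

    def __init__(self, age_days):
        self._age_days = age_days

    def get_cache_file_age(self):
        # Age of the cached file in days, as documented above
        return self._age_days


def is_stale(retrieved, max_age_days=30):
    """Return True if the cached file is older than max_age_days (assumed threshold)."""
    return retrieved.get_cache_file_age() > max_age_days


print(is_stale(CachedStub(45)))
print(is_stale(CachedStub(3)))
```

With a real ObjectMetadata instance, a stale result could then be re-requested by creating the object again with refresh=True.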
Examples
To use the class, provide a valid identifier:
>>> import pandas as pd
>>> from pybliometrics.sciencedirect import ObjectMetadata, init
>>> init()
>>> om = ObjectMetadata('10.1016/j.neunet.2024.106632')
The results property contains a list with the metadata of each object found. The available fields and their descriptions are listed in the Object Retrieval Views:
>>> om.results
[Metadata(eid='1-s2.0-S0893608024005562-gr3.jpg', filename='gr3.jpg', height=729,
          mimetype='image/jpeg', ref='gr3', size=100202, type='IMAGE-DOWNSAMPLED',
          url='https://api.elsevier.com/content/object/eid/1-s2.0-S0893608024005562-gr3.jpg?httpAccept=%2A%2F%2A',
          width=656),
 Metadata(eid='1-s2.0-S0893608024005562-gr5.jpg', filename='gr5.jpg', height=256,
          mimetype='image/jpeg', ref='gr5', size=44240, type='IMAGE-DOWNSAMPLED',
          url='https://api.elsevier.com/content/object/eid/1-s2.0-S0893608024005562-gr5.jpg?httpAccept=%2A%2F%2A',
          width=623),
 Metadata(eid='1-s2.0-S0893608024005562-gr4.jpg', filename='gr4.jpg', height=246,
          mimetype='image/jpeg', ref='gr4', size=51563, type='IMAGE-DOWNSAMPLED',
          url='https://api.elsevier.com/content/object/eid/1-s2.0-S0893608024005562-gr4.jpg?httpAccept=%2A%2F%2A',
          width=376),
 ...]
Note that each object is uniquely identified by its EID, which is a concatenation of the document’s EID and the object’s filename.
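As an illustration of this naming scheme, the document's EID can be recovered from an object's EID by stripping the filename. This is a minimal sketch: `document_eid` is a hypothetical helper, the Metadata namedtuple is a local stand-in for the one returned by the results property, and the hyphen separator is an assumption read off the example output above.

```python
from collections import namedtuple

# Local stand-in for the namedtuple returned by ObjectMetadata.results
Metadata = namedtuple(
    "Metadata", "eid filename height mimetype ref size type url width"
)

sample = Metadata(
    eid="1-s2.0-S0893608024005562-gr3.jpg",
    filename="gr3.jpg",
    height=729,
    mimetype="image/jpeg",
    ref="gr3",
    size=100202,
    type="IMAGE-DOWNSAMPLED",
    url="https://api.elsevier.com/content/object/eid/"
        "1-s2.0-S0893608024005562-gr3.jpg?httpAccept=%2A%2F%2A",
    width=656,
)


def document_eid(obj):
    """Strip '-<filename>' from the object's EID (separator is an assumption)."""
    suffix = "-" + obj.filename
    if not obj.eid.endswith(suffix):
        raise ValueError(f"EID {obj.eid!r} does not end with {suffix!r}")
    return obj.eid[: -len(suffix)]


print(document_eid(sample))
```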
The results can be cast to a pandas DataFrame:
>>> df = pd.DataFrame(om.results)
>>> # Print retrieved fields
>>> df.columns
Index(['eid', 'filename', 'height', 'mimetype', 'ref', 'size', 'type', 'url',
       'width'],
      dtype='object')
>>> # Get shape of the dataframe (rows x columns)
>>> df.shape
(355, 9)
>>> # Print the first 5 rows
>>> df.head()
                                eid  filename  height    mimetype  ref    size               type                                                url  width
0  1-s2.0-S0893608024005562-gr3.jpg   gr3.jpg   729.0  image/jpeg  gr3  100202  IMAGE-DOWNSAMPLED  https://api.elsevier.com/content/object/eid/1-...  656.0
1  1-s2.0-S0893608024005562-gr5.jpg   gr5.jpg   256.0  image/jpeg  gr5   44240  IMAGE-DOWNSAMPLED  https://api.elsevier.com/content/object/eid/1-...  623.0
2  1-s2.0-S0893608024005562-gr4.jpg   gr4.jpg   246.0  image/jpeg  gr4   51563  IMAGE-DOWNSAMPLED  https://api.elsevier.com/content/object/eid/1-...  376.0
3  1-s2.0-S0893608024005562-gr6.jpg   gr6.jpg   246.0  image/jpeg  gr6   53955  IMAGE-DOWNSAMPLED  https://api.elsevier.com/content/object/eid/1-...  376.0
4  1-s2.0-S0893608024005562-gr2.jpg   gr2.jpg   729.0  image/jpeg  gr2   98000  IMAGE-DOWNSAMPLED  https://api.elsevier.com/content/object/eid/1-...  656.0
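When pandas is not needed, the namedtuples can also be summarised with the standard library alone. The sketch below totals the object sizes per type. `size_by_type` is a hypothetical helper, the Metadata namedtuple is a local stand-in, and the three records are copied from the example output above so that the code runs without an API key.

```python
from collections import defaultdict, namedtuple

# Local stand-in for the namedtuple returned by ObjectMetadata.results
Metadata = namedtuple(
    "Metadata", "eid filename height mimetype ref size type url width"
)


def make_record(ref, height, width, size):
    """Build one record following the pattern of the example output."""
    eid = f"1-s2.0-S0893608024005562-{ref}.jpg"
    return Metadata(
        eid=eid,
        filename=f"{ref}.jpg",
        height=height,
        mimetype="image/jpeg",
        ref=ref,
        size=size,
        type="IMAGE-DOWNSAMPLED",
        url=f"https://api.elsevier.com/content/object/eid/{eid}?httpAccept=%2A%2F%2A",
        width=width,
    )


records = [
    make_record("gr3", 729, 656, 100202),
    make_record("gr5", 256, 623, 44240),
    make_record("gr4", 246, 376, 51563),
]


def size_by_type(results):
    """Sum the size field of all objects, grouped by their type field."""
    totals = defaultdict(int)
    for obj in results:
        totals[obj.type] += obj.size
    return dict(totals)


totals = size_by_type(records)
print(totals)
```

With a real instance, `size_by_type(om.results)` would give the same kind of summary over all retrieved objects.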