pybliometrics.scival.TopicLookupMetrics
TopicLookupMetrics() implements the metrics endpoint of the SciVal TopicLookup API.
It accepts one or more SciVal Topic IDs as the main argument and retrieves various performance metrics for the specified topics.
Documentation
- class pybliometrics.scival.TopicLookupMetrics(topic_ids, metric_types=None, by_year=False, refresh=False, **kwds)
Interaction with the metrics endpoint of SciVal's TopicLookup API.
- Parameters:
  - topic_ids (str|list) – SciVal Topic ID(s). Can be a single ID, a comma-separated string of IDs, or a list of IDs (e.g. [1516, 1517]).
  - metric_types (str|list|None, optional) – Metric type(s) to retrieve. Can be a single metric, a comma-separated string, or a list. Available metrics are: AuthorCount, CitationCount, CorePapers, FieldWeightedCitationImpact, InstitutionCount, MostRecentlyPublishedPapers, RelatedTopics, ScholarlyOutput, TopAuthors, TopCitedPublications, TopInstitutions, TopJournals, TopKeywords. If not provided, all metrics are retrieved. Default: None
  - by_year (bool, optional) – Whether to retrieve metrics broken down by year. Note: Some metrics are not available by year: CorePapers, MostRecentlyPublishedPapers, RelatedTopics, TopAuthors, TopInstitutions, TopJournals, TopKeywords. Default: False
  - refresh (bool|int, optional) – Whether to refresh the cached file if it exists. If an int is passed, the cached file is refreshed when the number of days since its last modification exceeds that value. Default: False
  - kwds (str) – Keywords passed on as query parameters. Must contain fields and values mentioned in the API specification at https://dev.elsevier.com/documentation/SciValTopicAPI.wadl.
- property topics: list[Topic]
A list of namedtuples representing topics and their basic info in the form (id, name, uri, prominencePercentile, scholarlyOutput).
- property AuthorCount: list[MetricData] | None
Author count metrics for each topic. Returns list of MetricData namedtuples with structure: (entity_id, entity_name, metric, year, value, percentage, threshold).
- property CitationCount: list[MetricData] | None
Citation count metrics for each topic. Returns list of MetricData namedtuples with structure: (entity_id, entity_name, metric, year, value, percentage, threshold).
- property CorePapers: list[CorePaper] | None
Core papers for the topic. Returns list of CorePaper namedtuples with structure: (entity_id, entity_name, publication_id).
- property FieldWeightedCitationImpact: list[MetricData] | None
Field-weighted citation impact metrics for each topic. Returns list of MetricData namedtuples with structure: (entity_id, entity_name, metric, year, value, percentage, threshold).
- property InstitutionCount: list[MetricData] | None
Institution count metrics for each topic. Returns list of MetricData namedtuples with structure: (entity_id, entity_name, metric, year, value, percentage, threshold).
- property MostRecentlyPublishedPapers: list[RecentPaper] | None
Most recently published papers for the topic. Returns list of RecentPaper namedtuples with structure: (entity_id, entity_name, publication_id).
- property RelatedTopics: list[RelatedTopic] | None
Related topics for the topic. Returns list of RelatedTopic namedtuples with structure: (entity_id, entity_name, related_topic_id, related_topic_name, related_topic_uri, prominencePercentile, relationScore, relatedTopicRank).
- property ScholarlyOutput: list[MetricData] | None
Scholarly output metrics for each topic. Returns list of MetricData namedtuples with structure: (entity_id, entity_name, metric, year, value, percentage, threshold).
- property TopAuthors: list[TopAuthor] | None
Top authors for the topic. Returns list of TopAuthor namedtuples with structure: (entity_id, entity_name, author_id, author_name, publicationCount).
- property TopCitedPublications: list[TopPublication] | None
Top cited publications for the topic. Returns list of TopPublication namedtuples with structure: (entity_id, entity_name, publication_id, citationCount).
- property TopInstitutions: list[TopInstitution] | None
Top institutions for the topic. Returns list of TopInstitution namedtuples with structure: (entity_id, entity_name, institution_id, institution_name, publicationCount).
- property TopJournals: list[TopJournal] | None
Top journals for the topic. Returns list of TopJournal namedtuples with structure: (entity_id, entity_name, journal_id, journal_name, publicationCount, citationCount, authorCount, publicationGrowth, authorGrowth, sjr, snip, citeScore).
- property TopKeywords: list[TopKeyword] | None
Top keywords for the topic. Returns list of TopKeyword namedtuples with structure: (entity_id, entity_name, keyword_name, keyword_uri, weight, relevance, publicationCount, publicationGrowth).
- get_cache_file_age()
Return the age of the cached file in days.
- Return type:
int
- get_cache_file_mdate()
Return the modification date of the cached file.
- Return type:
str
- get_key_remaining_quota()
Return the number of remaining requests for the current key and the current API (relative to the last actual request).
- Return type:
str | None
- get_key_reset_time()
Return the time when the current key is reset (relative to the last actual request).
- Return type:
str | None
Examples
You initialize the class with one or more SciVal Topic IDs. The argument can be a single ID, a list of IDs, or a comma-separated string of IDs.
>>> from pybliometrics.scival import TopicLookupMetrics, init
>>> init()
>>> topic_metrics = TopicLookupMetrics("2782")
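One way to picture the three accepted input forms is that each collapses to the same comma-separated string of IDs. The sketch below illustrates that idea only. `normalize_topic_ids` is a hypothetical helper written for this page, not a pybliometrics function, and the library's internal handling may differ.

```python
def normalize_topic_ids(topic_ids):
    """Collapse a single ID, a comma-separated string, or a list into one string.

    Hypothetical helper for illustration; not part of pybliometrics.
    """
    if isinstance(topic_ids, (str, int)):
        # A lone ID or an already comma-separated string
        parts = str(topic_ids).split(",")
    else:
        # Any iterable of IDs, e.g. [1516, 1517]
        parts = [str(t) for t in topic_ids]
    # Drop surrounding whitespace and empty fragments
    return ",".join(p.strip() for p in parts if p.strip())


print(normalize_topic_ids("2782"))         # 2782
print(normalize_topic_ids("9350, 11084"))  # 9350,11084
print(normalize_topic_ids([1516, 1517]))   # 1516,1517
```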
You can obtain basic information just by printing the object:
>>> print(topic_metrics)
TopicLookupMetrics for 1 topic(s):
- Enhancing Reproducibility through Open Science Practices (ID: 2782)
There are many properties available that provide different types of metrics. You can explore the available topics:
>>> topic_metrics.topics
[Topic(id=2782, name='Enhancing Reproducibility through Open Science Practices',
       uri='Topic/2782', prominencePercentile=99.044914, scholarlyOutput=2455)]
Properties with MetricData
Properties like CitationCount return a list of MetricData namedtuples with the structure: (entity_id, entity_name, metric, year, value, percentage, threshold) where entity_id and entity_name refer to the topic.
>>> topic_metrics.CitationCount
[MetricData(entity_id=2782, entity_name='Enhancing Reproducibility through Open Science Practices',
            metric='CitationCount', year='all', value=32637, percentage=None, threshold=None)]
Other properties that also return MetricData include AuthorCount, FieldWeightedCitationImpact, InstitutionCount, and ScholarlyOutput.
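Because every MetricData property shares the same seven fields, one small function can process any of them. The following is a minimal sketch that runs without API access: `MetricData` here is a local stand-in built with the documented field order (the real class comes from pybliometrics), and `values_by_topic` is a hypothetical helper. The `or []` guard reflects the documented return type `list[MetricData] | None`.

```python
from collections import namedtuple

# Local stand-in with the documented field order; the real class ships with pybliometrics.
MetricData = namedtuple(
    "MetricData",
    "entity_id entity_name metric year value percentage threshold",
)

# Row copied from the CitationCount output above
citation_count = [
    MetricData(2782, "Enhancing Reproducibility through Open Science Practices",
               "CitationCount", "all", 32637, None, None),
]


def values_by_topic(rows):
    """Map (entity_id, year) to value; an empty dict if the property was None."""
    return {(r.entity_id, r.year): r.value for r in rows or []}


print(values_by_topic(citation_count))  # {(2782, 'all'): 32637}
print(values_by_topic(None))            # {}
```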
Specialized properties
Properties like TopAuthors return specialized namedtuples with different structures.
>>> topic_metrics.TopAuthors[:3]
[TopAuthor(entity_id=2782, entity_name='Enhancing Reproducibility through Open Science Practices',
           author_id=6602573237, author_name='Vazire, Simine', publicationCount=29),
 TopAuthor(entity_id=2782, entity_name='Enhancing Reproducibility through Open Science Practices',
           author_id=57226848754, author_name='Ioannidis, John P.A.', publicationCount=28),
 TopAuthor(entity_id=2782, entity_name='Enhancing Reproducibility through Open Science Practices',
           author_id=23984790800, author_name='Dreber, Anna', publicationCount=22)]
Other specialized properties include CorePapers, MostRecentlyPublishedPapers, RelatedTopics, TopCitedPublications, TopInstitutions, TopJournals, and TopKeywords.
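The specialized namedtuples can be filtered and converted like any other namedtuple. The sketch below uses a local stand-in for `TopAuthor` with the documented fields, filled with the three rows shown above, so it runs without API access. The threshold of 25 publications is an arbitrary value chosen for illustration.

```python
from collections import namedtuple

# Local stand-in with the documented field order; the real class ships with pybliometrics.
TopAuthor = namedtuple(
    "TopAuthor",
    "entity_id entity_name author_id author_name publicationCount",
)

topic = "Enhancing Reproducibility through Open Science Practices"
top_authors = [
    TopAuthor(2782, topic, 6602573237, "Vazire, Simine", 29),
    TopAuthor(2782, topic, 57226848754, "Ioannidis, John P.A.", 28),
    TopAuthor(2782, topic, 23984790800, "Dreber, Anna", 22),
]

# Keep authors with at least 25 publications in the topic (arbitrary cut-off)
prolific = [a.author_name for a in top_authors if a.publicationCount >= 25]
print(prolific)  # ['Vazire, Simine', 'Ioannidis, John P.A.']

# _asdict() turns each row into a plain dict, e.g. for JSON export
records = [a._asdict() for a in top_authors]
print(records[0]["author_id"])  # 6602573237
```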
Concatenating Metrics
You can concatenate properties with MetricData into a single DataFrame for easier analysis.
>>> import pandas as pd
>>>
>>> # Two topics, retrieved without a yearly breakdown
>>> topics_metrics = TopicLookupMetrics(["9350", "11084"])
>>> data = []
>>> data.extend(topics_metrics.CitationCount)
>>> data.extend(topics_metrics.ScholarlyOutput)
>>> df = pd.DataFrame(data)
>>> df.head()
   entity_id                                         entity_name           metric  year  value  percentage  threshold
0       9350  Molecular Catalysts for Hydrogen Production Ad...    CitationCount   all  10707        None       None
1      11084  Machine Learning Potentials in Molecular Simul...    CitationCount   all  66222        None       None
2       9350  Molecular Catalysts for Hydrogen Production Ad...  ScholarlyOutput   all    809        None       None
3      11084  Machine Learning Potentials in Molecular Simul...  ScholarlyOutput   all   2405        None       None
Multiple Topics
You can analyze multiple topics simultaneously and retrieve metrics by_year:
>>> multi_topics = TopicLookupMetrics(["9350", "11084"], by_year=True)
>>> print(multi_topics)
TopicLookupMetrics for 2 topic(s):
- Molecular Catalysts for Hydrogen Production Advances (ID: 9350)
- Machine Learning Potentials in Molecular Simulations (ID: 11084)
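Since the parameter documentation lists several metrics as unavailable by year, it can help to sort a wish list of metrics into two groups before combining it with by_year=True. The following is a sketch under that assumption. `split_by_year_support` is a hypothetical helper, not part of pybliometrics, and the set is copied from the by_year note in the parameter list.

```python
# Metrics the documentation lists as not available by year
NOT_BY_YEAR = {
    "CorePapers", "MostRecentlyPublishedPapers", "RelatedTopics",
    "TopAuthors", "TopInstitutions", "TopJournals", "TopKeywords",
}


def split_by_year_support(metric_types):
    """Return (yearly, overall): metrics usable with by_year=True, and the rest.

    Hypothetical helper for illustration; accepts a list or a
    comma-separated string, like the metric_types parameter.
    """
    if isinstance(metric_types, str):
        metric_types = [m.strip() for m in metric_types.split(",")]
    yearly = [m for m in metric_types if m not in NOT_BY_YEAR]
    overall = [m for m in metric_types if m in NOT_BY_YEAR]
    return yearly, overall


yearly, overall = split_by_year_support("ScholarlyOutput, TopAuthors, CitationCount")
print(yearly)   # ['ScholarlyOutput', 'CitationCount']
print(overall)  # ['TopAuthors']
```

The two groups could then be passed as metric_types in separate requests, one with by_year=True and one without.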
Properties can always be converted to a DataFrame for easier analysis.
>>> df = pd.DataFrame(multi_topics.ScholarlyOutput)
>>> df.head()
   entity_id                                               entity_name           metric  year  value  percentage  threshold
0       2782  Enhancing Reproducibility through Open Science Practices  ScholarlyOutput  2024    580        None       None
1       2782  Enhancing Reproducibility through Open Science Practices  ScholarlyOutput  2020    394        None       None
2       2782  Enhancing Reproducibility through Open Science Practices  ScholarlyOutput  2021    462        None       None
3       2782  Enhancing Reproducibility through Open Science Practices  ScholarlyOutput  2022    480        None       None
4       2782  Enhancing Reproducibility through Open Science Practices  ScholarlyOutput  2023    539        None       None
Downloaded results are cached to expedite subsequent analyses. This information may become outdated. To refresh the cached results if they exist, set refresh=True, or provide an integer that will be interpreted as the maximum allowed number of days since the last modification date. For example, if you want to refresh all cached results older than 100 days, set refresh=100. Use topic_metrics.get_cache_file_mdate() to obtain the date of last modification, and topic_metrics.get_cache_file_age() to determine the number of days since the last modification.
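The refresh rule described above can be written as a small decision function. This is a sketch of the documented semantics only: `should_refresh` is a hypothetical name, and the library's actual implementation may be organized differently. The age argument is assumed to be a day count such as the one get_cache_file_age() returns.

```python
def should_refresh(refresh, cache_age_days):
    """Decide whether a cached file should be downloaded again.

    Hypothetical helper mirroring the documented rule:
    - True/False force or suppress a refresh,
    - an int refreshes only when the cache age exceeds that many days.
    """
    # bool is a subclass of int in Python, so it must be checked first
    if isinstance(refresh, bool):
        return refresh
    return cache_age_days > refresh


print(should_refresh(True, 3))     # True
print(should_refresh(False, 500))  # False
print(should_refresh(100, 101))    # True
print(should_refresh(100, 100))    # False
```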