Tyler Diorio, Patrick Joyce, Kobe Attias, Taki Koutsomitis, Gregor Zurowski, Brian Armstrong, Riccardo Goldoni, Jeffrey Koury
The purpose of our newly revamped Reputation v2 and Healthy Research Rewards is to provide new incentives that promote maximally reproducible research content. To this end, we’ve implemented two categories of incentives that we’ll dive deeper into below:
Proper documentation and assessment of reputation is an integral aspect of science. At ResearchHub we want to give users more context on who they interact with, who is making claims, and what their expertise is. REP v2 is surfaced most notably in the newly revamped Author Profile Pages which highlight the “Reputation” of a given author in their fields of expertise along with key statistics from their works, research achievements, and more.
Reputation is quantified across 248 research areas, termed “Subfields”. These Subfields are labeled using the OpenAlex topic classification system which leverages Scopus’ ASJC structure (see OpenAlex Topic Classification Whitepaper.docx). Through OpenAlex’s research information, we will be able to initialize Reputation and Author Profile Pages automatically for any researcher that verifies their identity, even one who has just joined ResearchHub for the first time.
Reputation is quantified by assigning point values to various actions and outcomes; these points are termed REP. We are using citations to initialize Reputation in a way that allows us to continually iterate towards better, more comprehensive metrics. We’ve opted to begin with citations alongside upvotes because it is important to recognize the existing corpus of researchers, who have contributed to their fields through published, peer-reviewed research in addition to ResearchHub-specific actions (i.e. upvotes). We’ve built the framework for REP using a composable structure that gives us plenty of room to be creative and continually add more nuance. Things we’re excited about considering in the future include open access scores, reproducibility, cross-disciplinary work, innovation, and more. We’re excited to iterate on REP with the ResearchHub community over the next few years and continue to push towards our mission of accelerating the pace of scientific research.
There are now 4 tiers of Reputation for every subfield on ResearchHub. These tiers are calculated by sampling all of the original research articles and preprints within a subfield and processing their citation counts. It is important to consider that the supply of citations varies considerably from one subfield to another. For example, the average number of citations in the “Molecular Biology” subfield is X per article or preprint, while the average in the “Machine Learning” subfield is just Y. This is because the culture, standards, and pace of research differ across subfields and require nuance.
We’ve attempted to account for this nuance by ranking authors within each subfield based on their citation count. For researchers whose Reputation comes solely from citations, Tier 1 corresponds to the 0-50th percentile of their subfield, Tier 2 to the 50-70th percentile, Tier 3 to the 70-90th percentile, and Tier 4 to the 90-100th percentile. A value of REP is then assigned per citation based on the supply of citations in the subfield and log10-scaled point values for each tier.
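As a minimal sketch of the percentile-to-tier mapping, using the 50/70/90 cutoffs above (the mapping below is illustrative, not the production implementation):

```python
# Sketch: map a researcher's citation percentile within a subfield to a
# Reputation tier. The 50/70/90 percentile cutoffs come from the text
# above; the code itself is illustrative.
TIER_CUTOFFS = [
    (50.0, 1),   # 0-50th percentile   -> Tier 1
    (70.0, 2),   # 50-70th percentile  -> Tier 2
    (90.0, 3),   # 70-90th percentile  -> Tier 3
    (100.0, 4),  # 90-100th percentile -> Tier 4
]

def tier_for_percentile(percentile: float) -> int:
    """Return the Reputation tier for a citation percentile in [0, 100]."""
    for cutoff, tier in TIER_CUTOFFS:
        if percentile <= cutoff:
            return tier
    raise ValueError("percentile must be within [0, 100]")
```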
Reputation can be gained through actions on ResearchHub that result in upvotes or citations such as:
Upvotes: Users will earn at least 1 REP per upvote received.
Citations: Users will earn a variable amount of REP per citation, depending on their subfield and their Reputation Tier within that subfield. The amount of REP per citation is calculated per subfield, as sketched below.
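One plausible reading, consistent with the tier structure above and the examples below, is that each tier’s REP span is divided by the number of citations spanning that tier’s percentile range, anchored to the subfield’s most cited researcher. The power-of-ten REP milestones below are an assumption based on the “log10” description, not confirmed values:

```python
# Hypothetical sketch of REP earned per citation. The percentile spans and
# the use of the subfield's most cited researcher come from the post; the
# power-of-ten REP milestones per tier are assumptions.
PERCENTILE_SPANS = {1: (0.0, 0.5), 2: (0.5, 0.7), 3: (0.7, 0.9), 4: (0.9, 1.0)}
REP_MILESTONES = {1: (0, 1_000), 2: (1_000, 10_000),
                  3: (10_000, 100_000), 4: (100_000, 1_000_000)}

def rep_per_citation(tier: int, max_citations: int) -> float:
    """REP per citation for a tier, anchored to the citation count of the
    subfield's most cited researcher."""
    lo_pct, hi_pct = PERCENTILE_SPANS[tier]
    lo_rep, hi_rep = REP_MILESTONES[tier]
    citations_spanning_tier = (hi_pct - lo_pct) * max_citations
    return (hi_rep - lo_rep) / citations_spanning_tier
```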
This results in a dynamic REP value per citation for each subfield, as a function of the supply of citations available in that field. To illustrate, here are three examples of the REP earned per citation in the Molecular Biology (high supply of citations), Artificial Intelligence (medium supply), and Philosophy (low supply) subfields:
The most cited researcher in Molecular Biology has 312,048 citations in original research articles or preprints.
The most cited researcher in Artificial Intelligence has 147,078 citations in original research articles or preprints.
The most cited researcher in Philosophy has 16,097 citations in original research articles or preprints.
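Plugging these three citation maxima into the sketch above yields illustrative Tier 1 values (again, these depend on the assumed REP milestones and are not official figures):

```python
# Illustrative Tier 1 values only, reusing rep_per_citation() from the
# sketch above and the per-subfield maxima quoted in the post.
for subfield, max_citations in [("Molecular Biology", 312_048),
                                ("Artificial Intelligence", 147_078),
                                ("Philosophy", 16_097)]:
    print(f"{subfield}: {rep_per_citation(1, max_citations):.4f} REP/citation")
# Molecular Biology: 0.0064 REP/citation
# Artificial Intelligence: 0.0136 REP/citation
# Philosophy: 0.1242 REP/citation
```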
These values of REP per citation are likely to change in the future as we continue iterating on the REP metric. We anticipate adjusting the balance of REP earned per upvote relative to REP earned per citation in the medium term, and we will rely heavily on the feedback provided by the ResearchHub community.
Reputation tiers will eventually serve multiple purposes on the platform. As a starting point, REP tiers will provide:
*Note: Users who verified their accounts before July 31, 2024 will need to verify again to receive these benefits.
Beyond reputational incentives, we want to financially reward researchers who have engaged in healthy research behaviors, including:
To this end, we have created author profiles for all researchers and enabled first authors to claim RSC rewards for their publications based on citation counts, weighted by Open Access, Open Data, and Preregistration status. The amount of RSC distributed per publication depends on 1) the subfield of the publication, 2) the number of citations, and 3) the degree of “openness”, i.e. open access, preregistration, and open data status.
*It is important to note that ResearchHub has implemented these financial incentives in a way that allows for and encourages continuous iteration and improvement. All of the structures described below are upgradable; please continue to provide us with feedback as we launch these new rewards.
Over the past 5 years of ResearchHub, the reward algorithm has under-emitted a total of 105M RSC. This RSC is set aside specifically to incentivize behavior that accelerates the pace of science, including actions that increase reproducibility. This provides a significant and unique opportunity to retrospectively reward researchers who have been practicing healthy research behaviors over the years. The challenge of establishing an initial condition for divvying up RSC across research disciplines is non-trivial. As a first pass, we’ve opted to distribute RSC as a function of the number of papers and number of citations in all subfields, over the window from ResearchHub’s inception (August 1st, 2019) to the present day.
To this end, we calculated the total number of citations and total number of papers in each of the 248 subfields over the past 5 years. This gives us each subfield’s percentage share of citations and of publications relative to the global research ecosystem. Since either metric alone would be insufficient for allocation, we opted to average these percentage weights to compute the RSC distributions. Final RSC distributions across all 248 subfields can be seen here: Healthy Rewards Per Subfield.xlsx, as well as visually in the pie chart below.
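As a sketch of that averaging step (the function and input shapes here are illustrative, not the production implementation):

```python
# Sketch: allocate the 105M RSC pool across subfields by averaging each
# subfield's share of global citations and its share of global
# publications. Inputs map subfield name -> 5-year count.
TOTAL_POOL_RSC = 105_000_000

def subfield_allocations(citations: dict[str, int],
                         papers: dict[str, int]) -> dict[str, float]:
    total_citations = sum(citations.values())
    total_papers = sum(papers.values())
    return {
        subfield: ((citations[subfield] / total_citations
                    + papers[subfield] / total_papers) / 2) * TOTAL_POOL_RSC
        for subfield in citations
    }
```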
Choosing the ideal distribution of healthy research rewards within a subfield is also non-trivial and deserving of nuance. As an initial approach, we’ve chosen to use a Zipf’s law rank-based ordering system where publications within a given subfield are first ranked from 1 to N, where 1 is the most highly cited paper. You can find a step-by-step walkthrough of the calculations below.
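As a minimal sketch of that ranking step, assuming the classic 1/rank Zipf weighting (the exact exponent and normalization used in production are not specified here):

```python
# Hypothetical sketch of the Zipf rank-based split within one subfield:
# the paper at rank r receives weight 1/r, with rank 1 being the most
# highly cited paper, and weights are normalized to sum to the pool.
def zipf_rewards(subfield_pool_rsc: float, n_papers: int) -> list[float]:
    weights = [1.0 / rank for rank in range(1, n_papers + 1)]
    total = sum(weights)
    return [subfield_pool_rsc * w / total for w in weights]
```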
*The "multipliers" for open data and preregistration status are additive as shown above. For example, having open data and a preregistration attached to an open access article would result in 6x base rewards (open access: 2000 RSC base reward, open data: +3*2000 RSC (+6000 RSC), preregistration: +2x*2000 RSC (+4000 RSC, for a total of 12000 RSC)
As we’ve greatly enriched the data that underlies works on ResearchHub, we’re now able to provide rewards for more actions in scientific research. Moving forward, we will be adding the capability to distribute RSC rewards on citations in addition to upvotes. The exact details of this implementation are still being worked out, and we’d love to hear your feedback about the best way to do this. Some open questions we have are:
We are eager to release these new incentive structures on the ResearchHub platform and are looking forward to hearing critical feedback from our community. It is important that we make sure these rewards incentivize healthy research behavior. In the words of the great American chemist Linus Pauling:
"If you want to have good ideas you must have many ideas. Most of them will be wrong, and what you have to learn is which ones to throw away."- Linus Pauling
We will continue to iterate on these models for the foreseeable future in hopes of stepping closer towards a better future for scientific research.
If you’d like to chat with members of the ResearchHub and ResearchHub Foundation team, consider connecting with us in the ResearchHub Discord or reach out to us via X/Twitter, LinkedIn, or email.
ResearchHub
ResearchHub Foundation