Citation trajectories

Posted on behalf of Greg Bothun (Physics)


Measuring the impact of one’s peer-reviewed publications is generally a highly subjective process, and it is often assumed that papers published in the leading journals of a field automatically have high impact.

Here I suggest a modest, data-driven process to remove this subjectivity and replace it with a simple, objective procedure that can easily be applied to Google Scholar profiles, since the data are readily available.

This procedure basically involves using the time history of citations as an impact guide. The concept of ½ life is relevant here: after a paper is published, there is some time scale over which ½ of its citations have occurred. Papers with a longer ½ life have likely had a larger impact. By way of example, we offer the following kinds of profiles:

  1. A high impact paper is one that has sustained referencing over time, at a quasi-constant level: Two example time histories of such papers are shown below (note that the form of the graph is more important than the actual citation numbers).

Paper I: Published in 2005 which does not show decaying reference behavior but fairly sustained referencing over 12 years:

Paper II: Published in 1986 which shows quasi-constant referencing over a 30 year period (yes it sucks to be old):

 

  2. The next example is a rarer form, which we will refer to as Pioneer impact. In this case the paper is not well acknowledged near the time of publication but has been rediscovered in later years.

In terms of numbers, the period 2000 through 2011 contains ½ of the total citation number (244), while the other ½ occurred over the 6 year period 2012-2017.  This is a qualitatively different citation history than the previous example.

Another example, this time on a small citation lifetime timescale, is shown below:

And here is an example of almost no impact early on, but the techniques described in the paper were discovered or emerged 10 years later:

  3. Next there is the case of the average or Moderate Impact paper, which has a decaying citation curve and a relatively short ½ life. Examples below:

For the latter case, although there is a long citation tail of low annual citations, ½ the citations were earned in a 5 year period near the time of publication.

 

  4. Finally there is the case of the Low Impact paper (like most of mine) that has just a brief period of annual citations before decaying:

I am not advocating that any of the above procedures be used; I am simply saying that information in addition to a raw number is available and can be used in some way to better characterize impact.
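The ½ life idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `citation_half_life` is hypothetical, the yearly counts are made-up numbers chosen to mimic the profile shapes (they are not the papers discussed in this post), and the ½ life is taken to be the number of whole years after publication by which at least half of all citations to date had accrued.

```python
from itertools import accumulate


def citation_half_life(pub_year, yearly_citations):
    """Years after publication by which at least half of all citations
    to date had accrued. Returns None if the paper has no citations.

    yearly_citations maps calendar year -> citations received that year.
    """
    years = sorted(yearly_citations)
    total = sum(yearly_citations[y] for y in years)
    if total == 0:
        return None
    running = accumulate(yearly_citations[y] for y in years)
    for year, cumulative in zip(years, running):
        # Compare 2 * cumulative with total to avoid fractional halves.
        if 2 * cumulative >= total:
            return year - pub_year
    return None


# Hypothetical citation histories (illustrative numbers only).
decaying = {2005: 10, 2006: 20, 2007: 12, 2008: 6,
            2009: 4, 2010: 2, 2011: 1, 2012: 1}
sustained = {year: 7 for year in range(2005, 2013)}
pioneer = {2000: 2, 2001: 1, 2002: 1, 2003: 2,
           2004: 2, 2005: 10, 2006: 18, 2007: 20}

half_lives = {
    "decaying": citation_half_life(2005, decaying),
    "sustained": citation_half_life(2005, sustained),
    "pioneer": citation_half_life(2000, pioneer),
}
```

All three made-up histories have the same total number of citations, yet their ½ lives differ: the decaying profile reaches the halfway point soonest, the sustained profile later, and the late-discovered profile last. That is the extra information a raw count hides.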


 

Value-Driven Metrics?

Posted on behalf of Elliot Berkman (Psychology)


Following on the article “Applying the Yardstick, Department by Department” that Dr. Bothun recommended on this blog, I was inspired to think through what a more faculty-driven set of metrics might look like. The article quotes Bret Danilowicz, Dean of CAS at Oklahoma State, on the metrics system they implemented there:

Mr. Danilowicz thought there was a better way, starting with getting lower-level administrators, department heads, and faculty members to participate in the assessment process every year.

“To me, the president and provost are too high a level for this,” he says. “If the goal is to give your departments a chance to take a good look at themselves, and with an eye toward improvement, you need to get faculty and chairs more involved in the process.”

A marine biologist by training, Mr. Danilowicz had served as dean of science and technology at Georgia Southern University. Now he wanted to understand the wider range of disciplines he was sizing up at Oklahoma State. It was important, he thought, to develop qualitative measures, not just numbers that could be mashed up.

“I’m a scientist, so grants and publications were very important to me,” he says. But as he talked with chairs and professors outside the sciences, he saw that each discipline brought its own yardstick.

“I came to learn that people in humanities want to review the quality of their scholarship. And arts people value creativity, which is really hard to measure,” he says. “The more I learned, the more it seemed natural to have the departments develop their own criteria and do their own assessments, and for my office to give them my thoughts on what they come up with.”

This process could start with a broad set of values and principles that can be different for each department. For example, in Psychology, I might articulate some of our values as they pertain to scholarship as:

  • High quality, high impact research
  • Diversity, equity, and inclusion in our research
  • Professional training and mentorship
  • Interdisciplinary, team science
  • Open and reproducible scholarship

Note that this is my personal articulation of some of our values and is not necessarily representative or universal. The set of values needs to be a product of the entire department. I can imagine that having a series of high-level conversations within a department about what values it hopes to promote with its scholarship might be a useful and interesting exercise in its own right.

But, for now, let’s start with this initial set of values. How might these be translated into measurable variables? In psychology, for better or worse, the unit of scholarship is the peer-reviewed publication, typically in a journal. So, I can attempt to articulate ways that the values above could be translated into metrics on a per-paper level. For each paper, I can ask the following yes/no questions:

  • Is the paper published in what my department considers to be a high quality, high impact, journal?
  • Is the sample or authorship team diverse, equitable, and inclusive?
  • Is a graduate student or postdoc first-author (or, possibly, any author)?
  • Is the research team interdisciplinary, as defined by bridging subdisciplines or spanning fields?
  • Are the methods open and reproducible?

How would this work with actual papers? Here are a few recent papers from my lab with scores:

  • Cosme, D., *Mobasser, A., Zeithamova, D., Berkman, E.T., & Pfeifer, J.H. (in press). Choosing to regulate: Does choice enhance craving regulation? Social Cognitive and Affective Neuroscience.
    • High-quality journal? ✅ This journal is considered a top journal in my field.
    • Diverse sample or authorship team? ✅ The sample is not particularly diverse but the authorship team is.
    • Student or postdoc first author? ✅ Cosme is a graduate student in psychology. 
    • Interdisciplinary? ✅ The authors span 3 of the 4 areas within our department.
    • Open science? ✅ The data and code are publicly available.
  • Berkman, E.T. (2018). Value-based choice: An integrative, neuroscience-informed model of health goals. Psychology & Health, 33, 40-57.
    • High-quality journal? Eh. It’s a niche journal but not what I’d consider top-tier.
    • Diverse sample or authorship team? Nope.
    • Student or postdoc first author? Nope.
    • Interdisciplinary? The content is sorta interdisciplinary, but the team is not.
    • Open science? This is a theory paper, but it is not open in the sense that the journal is not open access.
  • Giuliani, N.R., Merchant, J.S., *Cosme, D., & Berkman, E.T. (in press). Neural predictors of eating behavior and dietary change. Annals of the New York Academy of Sciences.
    • High-quality journal? Moderate. This journal would probably not make a selective list.
    • Diverse sample or authorship team? ✅ Diverse authorship team.
    • Student or postdoc first author? No, Giuliani is faculty in the College of Education.
    • Interdisciplinary? ✅ Yes, Giuliani is in the COE.
    • Open science? ✅ This is a review, but the paper and some of the materials we used are publicly available.

So what do I make of these ratings? As an author of these papers, this ordering (Cosme et al. > Giuliani et al. > Berkman) comports with my understanding of the “excellence” of these papers as I think of that term. Does this mean I think the Berkman (2018) paper is bad or low-quality? Not at all. In fact, I think there are some good ideas in that paper and that it might be influential in the field (a hypothesis I could test by watching how often it gets cited in the next few years). What it means is that the paper doesn’t advance my department’s values as much as the other two. I still get “credit” for it – it goes toward my publication count – but this system allows for a way to differentiate among my papers. The system prescribes a simple, moderately objective rubric for quickly assessing whether a paper promotes the values that I want to advance with my scholarship.

The incentives for our department are to publish papers that are in the journals we think are good; use diverse samples and authors; are authored by students and postdocs; are cross-disciplinary; and use open data and materials. I can game this system by publishing more papers like Cosme et al. Will I stop writing solo-authored theory pieces like the Berkman (2018) paper? No, because sometimes they’re fun and useful, and there’s not really a direct disincentive for me to write them (again, they still “count” and go on my CV and will be part of my p&t and merit review).

What would this process look like in practice? I can imagine that faculty score their own papers annually. When we send our CVs to CAS (which we already do each year), we could score all the new papers we published that year. Perhaps we could cap the score at some number, say 4, even if there are more than 4 values, so there are multiple ways for a paper to achieve the highest possible score. The departmental executive committee (or comparable), which already reviews files as part of merit reviews, could provide a sanity check on the scores produced by faculty.
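As a rough sketch of the per-paper scoring described above, the yes/no answers can be tallied and capped in a few lines of Python. The names here (`CRITERIA`, `values_score`, the criterion keys) are hypothetical, the cap of 4 is the example figure from the text, and the answers are my reading of the ✅ marks above, with the assumption that "Eh" and "Moderate" count as "no" for the journal question.

```python
# One key per yes/no question; the key names are illustrative only.
CRITERIA = (
    "high_quality_journal",
    "diverse_sample_or_team",
    "trainee_first_author",
    "interdisciplinary",
    "open_science",
)
SCORE_CAP = 4  # example cap suggested in the text


def values_score(answers, cap=SCORE_CAP):
    """Count the criteria answered 'yes', capped at `cap`.

    answers maps criterion name -> bool; missing criteria count as 'no'.
    """
    met = sum(1 for criterion in CRITERIA if answers.get(criterion, False))
    return min(met, cap)


papers = {
    "Cosme et al.": dict.fromkeys(CRITERIA, True),
    "Berkman (2018)": dict.fromkeys(CRITERIA, False),
    "Giuliani et al.": {
        "high_quality_journal": False,   # "Moderate" treated as no (assumption)
        "diverse_sample_or_team": True,
        "trainee_first_author": False,
        "interdisciplinary": True,
        "open_science": True,
    },
}

scores = {label: values_score(answers) for label, answers in papers.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

With the cap in place, Cosme et al. meets all five criteria but scores 4, the same as any paper meeting four of them, which is the "multiple ways to achieve the highest possible score" property. The resulting ranking matches the ordering discussed above.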

At the department level, what we’d get is a count of total papers produced by the department that year, as well as an average values score for the papers in that department. Perhaps we could also supplement those metrics with some basic and readily available additional data such as citations and media mentions. Those decisions would be made at the department level.

In the end, the average values scores are not interpretable on their own. They would need to be contextualized primarily by year-over-year trends (as a department we want to see the scores go up next year) and possibly by similar data from comparator departments. We could gather that data for a small number of other departments ourselves, or, by making our process open and transparent, encourage other departments to start collecting it themselves.
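The department-level summary described above (a paper count, an average values score, and the year-over-year change) might be computed along these lines. This is a sketch only: `department_summary` is a hypothetical name and the scores are invented for illustration.

```python
from statistics import mean


def department_summary(scores_by_year):
    """Per-year paper count, mean values score, and change in the mean
    relative to the most recent earlier year that had any papers."""
    summary = {}
    previous_mean = None
    for year in sorted(scores_by_year):
        scores = scores_by_year[year]
        avg = mean(scores) if scores else None
        if avg is None or previous_mean is None:
            change = None
        else:
            change = avg - previous_mean
        summary[year] = {"papers": len(scores), "mean_score": avg, "change": change}
        if avg is not None:
            previous_mean = avg
    return summary


# Invented per-paper scores for two years (illustration only).
example = {2017: [4, 0, 3, 2], 2018: [4, 3, 3, 2, 3]}
summary = department_summary(example)
```

The `change` field is the number a department would watch from year to year; the mean on its own carries little meaning without that context.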


 

On the work of the university

Posted on behalf of Ken Calhoon (Comparative Literature)


February 27th, 2018

Dear Friends and Colleagues,

Mozart wrote forty-one symphonies, Beethoven only nine. I have written none, but I offer these thoughts on metrics. I apologize in advance for the naiveté, as well as the pathos.

On September 14th, at the beginning of the current academic year, University Provost and Senior Vice President Jayanth Banavar hosted a retreat for “academic leaders” in the EMU Ballroom. The highpoint of the assembly, in my view, was Jayanth’s own (seemingly impromptu) description of the research of David Wineland, the Nobel Laureate who recently joined the UO’s Department of Physics as a Knight Professor. In a manner that suggested that he himself must have been a gifted teacher, Jayanth provided a vivid and accessible account of Wineland’s signature accomplishment—speculative work aimed at increasing the computational speed of computers by “untrapping” atoms, enabling them to exist at more than one energy level at a time. With a humorous gesture to his own person, Jayanth ventured that it might be hard to imagine his body being in two rooms at once, but Wineland had figured out how, in the case of very small particles, this is possible. My own knowledge of quantum physics is limited to the few dismissive quips for which Einstein was notorious, e. g. “God is subtle but not malicious.” In any event, Wineland’s work was made to sound original and impressive. Equally impressive was the personable, humane and effective fashion in which Jayanth, with recourse to imagery and physical self-reference, sought to convey the essence of his fellow physicist’s work across all the disciplines represented in the room—and at the University.

I was inspired by the experience of seeing one person so animated by the work of another. However, my enthusiasm is measured today against the discouragement and disaffection that I and so many of my colleagues feel at the University’s current push, without meaningful debate, to metricize excellence—to evaluate our research in terms quite alien to the values our work embodies. As a department head with a long history at this institution, I must say that I feel helpless before the task of breaking our work down into increments and assigning numerical values to them. It can be done, of course, but the resulting currency would be counterfeit.

Over the course of my thirty-one-year career at the University of Oregon, I have presided over quite a few tenure and promotion cases and have been party to many more, both as departmental participant and as a member, for a two-year stint, of the Dean’s Advisory Committee in the College of Arts and Sciences. I am also routinely asked to evaluate faculty for tenure and promotion at other colleges and universities, where the process is more or less identical to ours. In past years I have been asked to write for faculty at Cornell, Harvard (twice), Johns Hopkins (twice), Washington University, University of Chicago, University of Pennsylvania, University of Minnesota (twice), Penn State, and Irvine, among others. I mention this not to boast—god forbid!—but to emphasize that institutions of the highest standing readily recruit faculty from the UO to assist in their internal decisions on professional merit and advancement.

For such decisions at the UO, department heads solicit evaluations from outside reviewers who are not only experts in the relevant field but are also well placed. They are asked to submit, along with their review, their own curriculum vitae and a biographical sketch. Reviewers are instructed to identify the most significant scholarly contributions which the individual under review has made, and to assess the impact of those contributions on the discipline. They are also asked to discuss the “appropriateness” of the publication venues, and also to “contextualize” their remarks with regard to common practices within the discipline or sub-field. They are asked to compare, “both qualitatively and quantitatively,” the work of the individual under review with that of other scholars in the field at comparable stages in their academic careers. Finally, the outside reviewers are asked to state whether the research record under consideration would meet the standards for tenure and promotion at their home institution. These instructions, which follow a template provided by Academic Affairs, differ little if at all from those I have received from other universities.

In response to these requests, we typically receive narratives, often three and four pages in length, in which reviewers, in accordance with the instructions but also with the conventions of professional service, not only discuss the candidate’s work in detail but also contextualize that work in relation, for example, to the evolving nature of the field, to others working on the same or similar material, not to mention the human content of that material. (I am usually asked to review the work of scholars working on the history of German literature and thought, as well as literary and film theory.) Looking back over the reports I have authored, I see that they contain phrases like “body of work,” “breadth of learning,” “intellectual energy,” “daunting command,” “surprising intervention,” “dazzling insight,” “staggering productivity,” etc. These formulations are subjective. As such, they are consistent with the process whereby one mind comes to grips with another. I am inclined to say that this process is particular to the humanities, but Jayanth Banavar’s lively and lucid presentation of David Wineland’s research would prove me wrong. It conveyed excitement.

What distinguishes the humanities from the sciences and many of the other, empirically oriented fields is that our disciplines are not consensus-based. We disagree among ourselves, often sharply, on questions of approach or method, on the validity and importance of the materials studied, on how arguments or interpretations should be structured or conceptualized. These disagreements may take place between departments at different universities, or within a single department. Disciplines within the humanities are in flux, and we suffer the additional burden of finding ourselves in a social and cultural world whose regard for humanistic work is markedly diminished. We often scramble to re-define our relevance while the ground shifts beneath our feet. To seek a stable set of ostensibly objective standards for measuring our work is to misrecognize the very essence of our work. These same standards risk becoming the instruments of this misrecognition.

In any case, the process of review for tenure and promotion, as formalized by Academic Affairs and by the more extensive guidelines which each unit has created, and for which each unit has secured approval both by its respective college and by Academic Affairs, already accounts for such factors as the stature of a press or journal, the rigor with which books and articles are reviewed, the quantity of publications balanced against their quality, and the impact which the faculty member’s research has had, or may be expected to have. But why the need to strip these judgments of their connective tissue? And for whom?

Curriculum vitae – “the course of [one’s] life.” When I was an undergraduate (at the University of Louisville, no less), I was greatly influenced by an historian of seventeenth-century Britain, Arthur J. Slavin. The dean of the college, he had been a friend of the mathematician Jacob Bronowski, recently deceased at the time, best known for his PBS series The Ascent of Man. One episode of the series begins with a blind woman carefully running her fingers over the face of an elderly, gaunt gentleman and speculating as to the hard course of his life. “The lines of his face could be lines of possible agony,” she says. The judgment is subjective, but accurate: The man, like Bronowski a Polish Jew, had survived Auschwitz, the remnants of which provide Bronowski with a physical backdrop for the dramatic and moving summation of an episode dedicated to the ramifications of the Principle of Uncertainty, which had been formulated by Werner Heisenberg just as all of Europe was about to fall victim to a despotic belief in absolute certainty. “It is said that science will dehumanize people and turn them into numbers. That is false: tragically false. Look for yourself…. This is where people were turned into numbers.”

I don’t mean to overdramatize the analogy, or even really to suggest one. I am more interested in Bronowski’s general statement that “[all] knowledge, all information between human beings, can only be exchanged within a play of tolerance. And that is true whether the exchange is in science, or in literature, or in religion, or in politics, or in any form of thought that aspires to dogma.” The dogma we are faced with today is that of corporate thinking, which is despotic in the sense that it mystifies. We in this country are inclined to think that people who have amassed great wealth know something we don’t—that they have the magic touch. It is from them and their public advocates that we hear the constant calls for governments, universities, prisons, hospitals, museums, utilities, national forests and parks to be run more like businesses. Why? (And which businesses? IBM? TWA? Pan Am? Bear Stearns? Enron? Wells Fargo?) Why is the business model the presumed natural guarantor of good organization? Why not a symphony? an eco-system? a cooperative? a republic? a citizenry? Why is the university not a model for business? Businesses certainly benefit from the talent we cultivate and send their way, outfitted with the knowledge, the verbal agility, the conceptual power that make up our stock in trade.

Our current national political scene presents us with constant images of promiscuous, self-reproducing wealth. Within this context, which is an extreme one, it is urgent that we as a collective make our case, and in terms commensurate with our self-understanding as researchers, thinkers, writers, fine artists, and teachers, not in terms that conform so transparently to the prevailing model of worker productivity.

Those who maintain that inert numbers are the only means we have for communicating our value have already been proven wrong by our own provost. I call upon our president, our provost and our many deans to bring their considerable talents, their public stature, as well as their commitment to the University, to bear on our cause. Many of us, I’m sure, are ready to support you.

With respect and thanks,

Ken Calhoon, Head

Department of Comparative Literature


 

Slides from Social Sciences + Humanities Metrics Panel

Posted on behalf of Spike Gildea


On Tuesday, February 27, Scott DeLancey (Linguistics), Spike Gildea (Linguistics), Volya Kapatsinski (Linguistics), Leah Middlebrook (Comparative Literature), Lanie Millar (Romance Languages), and Lynn Stephen (Anthropology) hosted a panel discussion of metrics for measuring our departmental research quality and the quality of our graduate programs. The panel briefly summarized work done in some of their departments to identify what they value in their own work, ways to measure how well they achieve goals they value, and how they could take leadership in moving comparator institutions towards identifying and measuring their goals in comparable ways.

The slides from DeLancey, Gildea, Middlebrook, Millar, and Stephen are here:

Slides from Kapatsinski on graduate metrics are here:


 

Useful resources for department-level metrics

Posted on behalf of Greg Bothun (Physics)


It’s been my experience while at the UO that the UO often chooses to remain insular on many issues.

With respect to Performance Metrics as a way to evaluate Academic Departments there is a fairly robust literature on the Pros and Cons of this that I offer here, although I seriously doubt any of these references will be consulted:

  • What is the right way to measure departmental performance?
  • Performance measurement in academic departments: the strategy map approach
  • Criteria and Metrics for “Academic Programs:” Degree/Certificate Programs and Academic Departments (this is the Boise State Comprehensive Model)
  • Evaluating the performance of academic departments: an analysis of research-related output efficiency
  • Herding Cats? Management and University Performance (this is a broad overview of the economics of the matter and whether or not there is a demonstrable payoff)
  • The 2016 AAUP Statement on “Academic Analytics” (short, worth reading)
  • Applying the Yardstick, Department by Department

I will stop there – as I said these are unlikely to be consulted by any community of scholars here at the UO.
Perhaps the most important thing is a decent articulation (this means one that is actually believed) of what the goal is in applying performance metrics, on a department basis, here at the UO.

Political Science Metrics

Posted on behalf of Craig Parsons (Head, Political Science).


Here is a summary of what PS has agreed to do, with a faculty vote last Thursday, on research metrics. Let me stress that I was very surprised that we were able to get to agreement—so maybe there are useful hints here for other departments that don’t have an obvious path forward at this point.

Our solution for journals/presses starts from our recently-renegotiated Course Load Adjustment policy. That system includes a “top” category of 8 journals and 14 presses that receive “bonus” points, and then gives “normal” points for all other peer-reviewed articles and books.

From there we decided that each faculty can nominate ONE more “top general” journal, to be added to those existing 8; and three journals of their choice for a “top specialty” category. There will be some overlap in their suggestions (which we know because we did a partial mock survey before this real one), so with our 20-ish faculty I’m going to bet we end up with 20-ish “top general” journals and 45-ish “top specialty.” Not the tightest list, but not ridiculous.
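The reason the lists come out shorter than the raw number of nominations is that duplicate picks collapse. A minimal sketch of that bookkeeping follows, assuming one "top general" and up to three "top specialty" nominations per faculty member as described above. The function name `build_venue_lists` and all journal and faculty names are placeholders, and the existing list is shortened for brevity; how the department handles a journal nominated in both tiers is not stated, so the sketch leaves such cases untouched.

```python
def build_venue_lists(existing_top_general, nominations):
    """Merge faculty nominations into de-duplicated venue lists.

    nominations maps faculty name -> {"general": str or None,
                                      "specialty": list of up to 3 str}.
    """
    top_general = set(existing_top_general)
    top_specialty = set()
    for faculty, picks in nominations.items():
        general = picks.get("general")
        if general:
            top_general.add(general)
        specialty = picks.get("specialty", [])
        if len(specialty) > 3:
            raise ValueError(f"{faculty} nominated more than three specialty journals")
        top_specialty.update(specialty)
    return sorted(top_general), sorted(top_specialty)


# Placeholder data: the real starting list has 8 journals.
existing = ["General Journal 1", "General Journal 2"]
nominations = {
    "Faculty A": {"general": "General Journal 3",
                  "specialty": ["Specialty 1", "Specialty 2", "Specialty 3"]},
    "Faculty B": {"general": "General Journal 3",
                  "specialty": ["Specialty 2", "Specialty 4", "Specialty 5"]},
}

top_general, top_specialty = build_venue_lists(existing, nominations)
```

Here two faculty members make eight nominations in total, but overlap reduces them to one new general journal and five specialty journals, the same effect that turns roughly 20 faculty into "20-ish" and "45-ish" lists.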

Then—the brilliant suggestion from my colleague Priscilla Yamin, which brought an overall deal within reach—we added an “interdisciplinary venues” category. We have a lot of people who do interdisciplinary research, and were having a lot of trouble with how to fit those options into the top general/top specialty tiers. For this third category, we decided that we would include a set of flagship journals in related areas, plus each faculty could name two other journals.

On presses, since we already began from a generous list of 14 “top” presses, we just asked everyone to name three “top specialty” presses. We specified that they must be university presses, which makes sense in political science.

For awards and grants, things were simpler—we’re mostly just shooting for a capacious list—and you can see what we did in the links below.

Of course there are another few details, but that’s the basic idea. It isn’t going to produce a beautiful hierarchical measure of quality, but no such measure would be acceptable to our department. This process sets fairly reasonable bounds on a deal to let everyone name the venues they respect the most.

NOTE: PLEASE DON’T ENTER ANYTHING IN THESE LINKS. THEY ARE FEEDING INTO OUR SURVEY. [VIEW ONLY]

Scholarly Output: https://blogs.uoregon.edu/polisci/wp-admin/admin-ajax.php?action=frm_forms_preview&form=mtury

Awards: https://blogs.uoregon.edu/polisci/wp-admin/admin-ajax.php?action=frm_forms_preview&form=wtjh1

Grants: https://blogs.uoregon.edu/polisci/wp-admin/admin-ajax.php?action=frm_forms_preview&form=tullh


 

Introducing the CAS Metrics Blog

This site is for open discussion of mission metrics by CAS departments. We encourage CAS departments to post updates about their progress as they put together their metrics templates. All are welcome to comment, though anonymous posting is not allowed.

The instructions from CAS via OPAA are here:

An example template is below. Note that the specific metrics listed here are examples only. CAS departments are encouraged to identify metrics that relate to their departmental mission and priorities.