Ratings — Concerts vs. Media Reviews


2020-12-06 — Original posting

Ratings (“Grades”) in Reviews of Concerts and Media: Summary

This text attempts to explain how I rate CD recordings and concert performances. How do the star ratings come about, and what are they meant to convey? It covers background and rating criteria, as well as aspects such as subjectivity vs. objectivity (if the latter exists at all), systematics, coherence, and limitations.


Introduction: Ratings in Music



Strictly speaking, rating (especially numeric rating) in classical music is nonsense: music performance cannot be measured. Sure, one could find out whether an artist (or orchestra) follows the score exactly, and assign a “perfect” performance a top rating. But would you then lower that rating based on the number and severity of any “errors” made? Moreover, such a “perfect” performance would be absolutely dead in the listener’s ear. One could just as well have a computer / synthesizer do the performance. The result would be “perfect”, but really as dead as a machine.

On top of that, musical notation is not an exact science (maybe with the exception of some compositions from the past decades, written for computer & synthesizer): the composer only writes down his or her intent and ideas to the degree of accuracy and precision that the notation allows. Often, composers implicitly rely upon conventions, which may get lost or altered with time. Over the last century, composers have tried hard to be more specific and detailed in their notation. However, there is still a human factor involved in the performance.



Music lives through the interpretation of the score / the notation, i.e., it conveys the artist’s personal view / feelings / emotions. On the recipient’s side, it involves the listener’s “receptivity”, with aspects such as familiarity with the style and genre and the willingness to let the music speak to one’s heart. This makes ratings even more doubtful, at least in the sense that they cannot possibly be objective, but are unavoidably subjective. Overall, ratings really are questionable, if not utter nonsense, at least in a generic sense, i.e., without additional background information.

What do I mean by “additional information”? I think that once you get to know a reviewer (his/her taste, preferences, musical background), ratings can indicate whether and how much that reviewer likes a given performance or recording, how much he/she thinks one performance is superior / preferable to another one, and the like. In other words, and considering this blog: if I had posted just one “naked” review, and nothing else, that would be absolutely useless and worthless. However, in the context of all the other reviews, a reader can get to know how my musical mind is “ticking”, and this should permit putting ratings into a context, which may make them useful.

Despite what I just stated, I still use ratings in (most of) my reviews. Below, I explain my intent with these ratings in the case of concert reviews (i.e., live experiences), as well as in media reviews (i.e., recorded, “frozen / preserved” music performances).


Stars vs. Numbers — Scale and Gradation

A number of rating schemes are in use. I had to make up my mind about selecting the one that suits me best. Here are the most common options:

  • ★ .. ★★★★★. This is probably the most common scheme: simple, easy. The main disadvantage is that it offers little differentiation, especially when the ratings predominantly fall into 3, 4, or 5 stars only. It is “hardcoded” into Apple Music (the software that I use to keep my digital library). So, naturally, I use it to rate individual tracks / movements in media reviews, see below. It’s also the most frequently used rating method on feuilleton pages in print media. And while I was doing concert reviews for Bachtrack.com, I was forced to use this scheme for their reviews.
  • Numeric, 1 .. 10. Used less often, but offers twice the differentiation compared to the five-star rating. This can be refined with fractional values, i.e., one digit behind the decimal point (1.0 .. 10.0).
  • Numeric, 1 .. 100. Rarely used, but offers the most differentiation (about the same as 1.0 .. 10.0). The main place where I see this scheme being used is in competitions, where jury members need to rate the performances of dozens or more (good) candidates with enough differentiation to permit selecting the winner and the top 3 – 10 performers.
  • ★ .. ★★★★★ with half-star intermediates. One can see this as the equivalent of a numeric rating with values between 1 and 10, with the advantage of slightly better visibility, i.e., being more eye-catching.

Selection for this Blog

As I continue to use Apple Music (formerly iTunes) as the platform for my music collection, using one of the above star schemes is a given for media reviews. I rate tracks with whole-star values (that’s all Apple Music permits). For results (averages of track ratings), I use half-star steps for additional differentiation, or occasionally numeric results with one digit after the decimal point, giving a range of 50 steps of 0.1 (stars).
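The arithmetic behind such overall results can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names and the sample track ratings are invented for this example, and rounding exact ties upward is an assumed convention, not necessarily the one used for the results in this blog.

```python
import math

def average_rating(track_ratings):
    """Return the mean of whole-star track ratings (each from 1 to 5)."""
    if not track_ratings:
        raise ValueError("need at least one track rating")
    return sum(track_ratings) / len(track_ratings)

def to_half_stars(value):
    """Round to the nearest half star; exact ties go upward (an assumption)."""
    return math.floor(value * 2 + 0.5) / 2

def to_tenths(value):
    """Round to one digit after the decimal point; ties go upward (an assumption)."""
    return math.floor(value * 10 + 0.5) / 10

# Hypothetical whole-star ratings for the three movements of one recording.
tracks = [5, 4, 4]
mean = average_rating(tracks)
print(to_half_stars(mean), to_tenths(mean))
```

For these sample ratings the mean is 4.33..., which the sketch reports as 4.5 in half-star steps and as 4.3 in decimal notation. The two forms can therefore differ visibly for the same recording, which is why the decimal form offers finer differentiation.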

For concert reviews (live performances), I feel that a five-star range with half-star steps yields enough differentiation. I may occasionally deviate from that by using decimal notation, but still in the range 1.0 .. 5.0 (I could of course go down to 0.0, but I hope never to encounter such performances!).

I have also explained my idea about the five-star (★ .. ★★★★★) rating scheme on my “Typography” page. This primarily applies to concert reviews. Let me quote the relevant list from that page:

  • ★★★★★ — Stellar, “once-in-a-century” performance or listening experience (or at least close to that)
  • ★★★★½
  • ★★★★ — Excellent performance
  • ★★★½
  • ★★★ — Good, standard/average, or moderate performance
  • ★★½
  • ★★ — Clearly substandard performance: I have major objections, etc.
  • ★½
  • ★ — A listening experience that I definitely don’t want to repeat or an appalling performance / really odd interpretation.

For more details see below.


Rating Consistency / “Validity”

Given that very few aspects of a music performance are actually measurable, ratings in music reviews (media or concert) strictly cannot be universally valid: there is no such thing as a universal scale. Ratings always must be viewed in their context, i.e., typically (at least) the publication medium (e.g., a given newspaper or Website), or even better the reviews of a given journalist.

There may be differences in how journalists apply ratings. Editors don’t have the means of correcting how individuals rate concerts or media (unless they had been present at a given concert). In the end, how one reads ratings (and how and whether one can trust them) comes down to how much one trusts a given medium, and/or (better) a given journalist.

Consistency and Trustworthiness

My goal as a critic is to be trustworthy, i.e., I would like people to trust my reviews and their ratings. With this I don’t imply that readers necessarily agree with my critiques. I don’t mean to impose my opinions on the readers. Everyone is of course allowed his/her own opinion. I want to be trustworthy in the sense that visitors get to know my personal, sincere opinion / judgement on pieces. And it is consistency throughout the blog which hopefully makes readers trust my text as being my true personal view.

One of the reasons for starting my blog with (seemingly trivial) biographic notes was for people to learn where I’m “coming from”, to get to know the author. The large number of reviews that I have posted so far also allows the visitor to look at reviews of similar music and/or performances (concerts or media). This will tell about my personal taste, i.e., where my personal preferences lie.

In other words: it is only my existing posts which give my music / performance ratings a meaning. For this to work, it is a prerequisite that my ratings are both justified (at least on my part) and consistent. The “justified” part is given through my (usually extensive) comments. To some degree, how I (try to) achieve consistency & coherence in my ratings depends on the type of review: media vs. concerts, known music vs. new or unknown territory, see below. Furthermore, my preferences and my taste of course evolve, both over time and with the experience from concerts and new recordings. Hence, consistency does not imply “fixed for a lifetime”: returning to existing media reviews after years is likely to result in revised ratings.

My Ratings in Media Reviews

Media reviews are different from concert reviews. In the former, the artist’s “interaction” with the listener is far more indirect (no eye contact, no concert atmosphere, no audience). Hence, one might suspect that these are less subject to visual / “human interaction bias” (which, of course, forms an integral part of a concert experience), see also below.

In media reviews, there are some subtle differences between ratings for single, individual recordings, and reviews that compare several recordings:


Reviews Presenting a Single Recording

Reviews presenting a single recording (e.g., CDs that were sent to me for reviewing, or many of my “Listening Diary” posts) follow a scheme that tries to be “absolute” and consistent with that in concert reviews, as outlined above. In “Listening Diary” posts, I feel most “independent” and try hard to apply an “absolute” rating scale. These posts are about recordings in my collection, and so, I naturally hope that there are only few “bad” recordings. If a recording still turns out “bad” (in my judgement, for whatever reason), I may leave out the rating, or not review the recording at all.

Media Submissions

With media that I receive for reviewing, I need to be a little careful. I don’t want to hurt the submitter (artist, agency) by giving a really bad review. I’d rather refuse to review a recording. For the same reason, I don’t promise reviews of media that are sent to me (especially if this was not solicited). In a recent case, an agency gave me a preview of a recording, and on the phone we discussed my initial findings, and we concluded that I would not review. My review would have been OK, but certainly not stellar or enthusiastic. And not really useful to promote an artist’s career.

I don’t mean to be over-critical. Even if a recording is not entirely stellar, I can and will still try looking for positive aspects that I find noteworthy. However, this must remain in the context of an overall (ideally comprehensive) review. I don’t want to “distort” my reviews just to be nice to the submitter, and I want to maintain consistency across my reviews.


Media Comparison Posts

Comparison posts are slightly different. I try to retain internal consistency, in that, e.g., the ratings for a given movement don’t depend on those of other movements. Otherwise, however, I will typically use a larger rating range. In practice, that’s often ★★ (rarely below) up to ★★★★★ for the best performance in the comparison. This is an attempt to retain enough differentiation. Keep in mind that Apple Music does not allow for “half star” values.

These ratings may be somewhat independent of what a “universal” scaling would be. In other words: in the absence of obvious deficiencies, the best recording in a comparison is likely to receive a ★★★★ or ★★★★★ rating. Consequently, when I later add a recording that feels “better”, I may need to re-scale some or all of the existing ratings, i.e., lower all existing values, or compress their range.
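One way to picture such a re-scaling is a simple linear compression of the existing values, so that a newly added, better recording can take the top rating. The following Python sketch is purely illustrative: the function name, the recording names, and the linear mapping are assumptions made for this example, not a description of how the ratings in this blog are actually revised.

```python
def compress_ratings(ratings, new_low, new_high):
    """Linearly map existing ratings onto the range new_low..new_high."""
    old_low = min(ratings.values())
    old_high = max(ratings.values())
    if old_low == old_high:
        # All recordings were rated the same: place them at the top of the new range.
        return {name: float(new_high) for name in ratings}
    span = old_high - old_low
    return {
        name: round(new_low + (value - old_low) * (new_high - new_low) / span, 1)
        for name, value in ratings.items()
    }

# Hypothetical comparison with three recordings rated 5, 4 and 2 stars.
existing = {"Recording A": 5, "Recording B": 4, "Recording C": 2}

# Compress the old range 2..5 into 2..4, then add the new, better recording.
rescaled = compress_ratings(existing, 2, 4)
rescaled["Recording D"] = 5.0
print(rescaled)
```

In this example the former top recording drops from 5 to 4.0, the middle one from 4 to 3.3, and the lowest stays at 2.0. For track ratings stored in Apple Music, such values would still have to be rounded to whole stars.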

In compensation for the limited number of rating steps (in Apple Music), I will typically present overall comparison results (e.g., the average of all ratings for a given composition in a given recording) in numeric form, with a fractional number with 1 or 2 digits after the decimal point (or at least using half-star intermediate values).

Rating Criteria in Media Reviews

Reviews of Individual / Single Media

There are two principal cases:

Unknown / New Music

In the case of new music, music that is unknown to me, and/or music for which I don’t have access to the score (as is almost always the case with new music), I try applying the following criteria:

  • I try rating performance criteria, such as coordination, clarity of articulation, agogics / rubato, vibrato, dynamics, phrasing, etc. (to the degree that this is possible without a reference / score).
  • Does the tempo, the character of the performance (along with articulation, etc.) match the annotation of the music?
  • To me, the quality of the recording in general is of lesser importance—though muffled, muddy sound certainly does not help.
  • Badly sounding instruments are another matter, though: twanging piano strings, bad tuning, scratching string instruments, the sound of instrument groups in orchestras, their transparency & coordination are key factors I’m taking into consideration.
  • Inevitably, my emotional response will play a key role, i.e., whether I “like” and enjoy the music, the performance, etc. (I try avoiding music that I don’t like in the first place, though).

Keep in mind, though, that I cannot cover all of the above aspects in my texts.

Using Sheet Music / Scores

If I have the score / sheet music at hand, this certainly helps in arguing some of the aspects mentioned above. In addition, it gives me a better idea about whether a performance is close to the composer’s intent.

Familiar Compositions

Moreover, if I know a composition from concert performances, radio, streaming platforms, etc., this definitely facilitates the argumentation. Experience with a composition is helpful in finding words for a comment, as I may already have an idea about how (in my opinion) a piece should sound. I try avoiding bias / prejudice, but experience still helps.

With familiar artists (artists that I have watched / heard on YouTube, in concert, etc., maybe even artists that I know personally) I may have a preconceived opinion about what to expect. Still, I try to reassess a performance / recording anew, being sincere, detailed and truthful. This isn’t always easy. However, it helps to know that in general, artists appreciate genuine, frank opinions.


Media Comparisons

In the above situation (single recording), the reader is very much “in the air”, i.e., one may or may not trust my judgement of a given recording. Still, the vast collection of reviews in my blog will / should assist in assessing what my personal preferences and criteria are. This should help in classifying my comments and the value of my judgements in relation to your own views, taste, preferences, etc.

Comparisons of multiple recordings remain as personal and almost as subjective as the above. However, comparisons in tempo, articulation, dynamics, agogics, phrasing and overall flow, etc. give concrete evidence for arguing and for judging recordings. And this may also give readers a better handle on how my ratings correlate with their own preferences etc.

In the case of collections / many movements / many recordings, I often create color-coded tables indicating my track ratings (from my Apple Music library), as well as comparisons of durations or metronome readings. This serves to visualize comparisons (tempo or timings, and ratings). On the page “Pictures, Tables, Measurements”, I posted remarks on how I measure metronome numbers and durations.

I may point out the value of historic recordings (e.g., by famous / great conductors / artists of the past), but I don’t regard historicity by itself as a quality criterion. I’m mostly interested in today’s performances by today’s artists.

My Ratings in (Live) Concert Reviews

Media and concert reviews differ by the very nature of their “object”. In concerts, the listener (inevitably) witnesses the interaction between the artists and the audience. Or the absence of such interaction. In addition one can observe the artist’s action at the instrument, and often also the artist’s emotional expression, either in response to the music, or in living within / in tune with the music and the instrument, the playing.

For the reviewer, it is easy to get “pulled into” a performance. It is far easier than in the indirect experience of a CD recording. On the other hand, there is a certain danger of getting “turned off” where artists avoid showing any emotion or commotion, or do not visibly interact with the audience at all. As a critic, one should be able to abstract from these aspects.

On the other hand, to me, a concert critique would be incomplete if it were to limit itself to aspects of performance, sound, and acoustics (as if the critic had been wearing black glasses and hadn’t taken any notice of the audience’s reaction). At least in my own critiques, I also want to give an impression of the overall concert experience. This inevitably involves some subjective, emotional components. Ideally, I try to point these out to the reader.

As for the rating in concerts, I use two different approaches:


“Known” Music

How I prepare myself for reviewing a concert is often a matter of time and resources. Ideally, I try listening to other performances beforehand, from my collection, or via audio or video streaming platforms. That is, unless I’m already intimately familiar with a composition, of course. I try organizing (downloading) scores onto my tablet. During the concert, ideally, I keep following the score on my iPad, while frantically scribbling notes in order to memorize as much as possible. And I may be taking photos in parallel (only if that is permitted, of course).

My notes typically include spontaneous ratings. I’m covering as many as possible of the performance aspects mentioned above. On top of that my review is also reflecting my personal concert experience, the atmosphere, the artists’ action and interaction. After the performance, or when writing up the review, I may still add corrections to the rating, in order to maintain consistency with existing reviews.


New and Unknown Music

For new and unknown music (which may include also Renaissance or 20th century music for which sheet music or a score is not readily available), my “objective” reviewing and rating criteria are somewhat limited, as outlined above. Inevitably, my personal impression, my own emotional response will play a much larger role than with known (e.g., baroque, classic, or romantic) music. With that, the rating becomes more personal. Consequently (as I can’t support my arguments with information from a score), I’m relying upon the reader’s trust in my judgement, be it from similar reviews in my blog, or in fact from (m)any of the existing blog postings.

This brings up a last question: how do I listen to new(est) and/or unknown music? Actually, I don’t have much of a detailed recipe. I try to stay unbiased, open, while letting the music “work on me” (on my mind, my emotions, my imagination). Consequently, my reviews then will primarily be descriptive. I’ll mention my impressions, as well as my spontaneous associations, either with other music, or with scenery / situations / pictures that spring to mind.

All this requires a high amount of focus and attention. However, in my experience, the complexity of managing all of the above tasks is not distracting. Quite to the contrary: the concert impression becomes more intense, rich, deep and detailed. Active listening brings about a far more immersive experience, compared to a situation where I’m just sitting there, listening passively.



Note that I’m only reviewing and rating performances and music from genres and periods that I’m familiar with (no Jazz, no Folk, etc.), see also my separate document covering this topic in more detail. And within that scope, the above differentiation between concert and media reviews, as well as between reviews covering a single performance vs. media comparisons, applies.

Points of Debate?

Negative Feedback?

Over all the time that I have spent on my blog so far, I have had very few instances of negative feedback. One was from a friend who took a critical review of an artist whom he admired personally, and sent me a patronizing message that was insulting enough for me to unfriend that connection. Then, there were about 2 – 3 instances of negative feedback from readers who either did not read my text carefully enough (or failed to understand my crooked English syntax) or simply did not understand that this is a private blog, and I have the legal right to utter my personal opinion.

These are rare exceptions, however: the overwhelming part of responses from readers, artists, and agents is positive, if not enthusiastic (see also my Testimonials page). One other exception to this comes to mind, which luckily I was able to resolve:

The Dilemma with Multi-Recitals

In a recital with several artists (all young and at the end of their education at the conservatory) I tried my best at being consistent and coherent with my general rating practice. None of the performances was bad at all. However, to me, there were clear differences in the performance levels of these artists, and I wanted my rating to reflect this difference. Hence, there were artists with a ★★★ rating, and others with ★★★★. None of the artists in that recital received any rating below ★★★.

One of the artists with a ★★★ overall result contacted me, telling me that he felt that his “bad” result was unfair and did not reflect his sincere, professional effort to achieve best, if not perfect results. I explained to him a) my dilemma with the need for differentiation, and b) that ★★★ is not bad at all, see also my page “Typography, Conventions”. In the end, we parted in agreement and in a friendly atmosphere. And yes, we still talk to each other!



Whether you agree or disagree with my ratings: remember the motto of this blog: it’s all just my personal opinion—whatever this is worth. I do mean what I say & write, and I try my best to be sincere and truthful to my personal impressions, my modest opinion, my own emotional response. Take my judgement as one single “datapoint”, a stone in a puzzle of opinions. Other judgements are perfectly acceptable and even welcome (feel free to utter these in comments!): I’m not “in possession of the absolute truth”.

Let me close with the following: Trust me—rating media or concert performances is not something I do lightly, even though it is “just my personal, subjective opinion”!

The Author in the audience @ Landenberghaus, Greifensee, 2020-09-11 (© Rolf Kyburz)

Finally, let me refer to two other documents in this blog, containing related information:

