Most Frequently Similar on Lesbian Sex stories

There are 8397 distinct stories in the Similar Stories list. Of those, 5374 are Lesbian Sex stories, or 64%. So 30% of all LS stories appear as a Similar Story for an LS story. Conversely, 70% of all Lesbian Sex stories don't appear as a Similar Story for any LS story.

Chapter stories complicate the interpretation. Is it correct that only stand alone stories and first chapters appear as Similar Stories? Can you tell how many stand-alone stories and distinct series there are in LS? It would also be interesting to see the figures for a few of the least popular stories that made it onto an SS list. Can you verify that there are some stand-alone LS stories that did not make it onto any LS SS list?

Just curious. All this data mining is pretty fascinating, to me, at least.
 
Chapter stories complicate the interpretation. Is it correct that only stand alone stories and first chapters appear as Similar Stories?
That is generally true. What is a series and what is the first chapter can be odd at times, but that is noise.

Can you tell how many stand-alone stories and distinct series there are in LS?
There are 8,842 LS stories that Literotica has deemed in a series and 10,074 that Literotica has deemed to be stand-alone. I can't tell you how many distinct series there are.

It would also be interesting to see the figures for a few of the least popular stories that made it onto an SS list. Can you verify that there are some stand-alone LS stories that did not make it onto any LS SS list?
Sketching Sophie is the story with the best statistics that doesn't appear as a Similar Story: 4.87 rating, 156 favorites, 61 comments, published 1/11/2020, 5 tags. Mac and Me is next with a 4.82 rating, 153 favorites, 51 comments, published 12/12/2008, 10 tags.
 
Chapter stories complicate the interpretation. Is it correct that only stand alone stories and first chapters appear as Similar Stories? Can you tell how many stand-alone stories and distinct series there are in LS? It would also be interesting to see the figures for a few of the least popular stories that made it onto an SS list. Can you verify that there are some stand-alone LS stories that did not make it onto any LS SS list?

Just curious. All this data mining is pretty fascinating, to me, at least.

There are exceptions - I mentioned a Chapter 6 where four of the Similar Stories are previous instalments in the same work. But I don't see it often.

It might be a deliberate part of the algorithm to avoid recommending sequel chapters when there's any alternative, or it might just be a consequence of sequels generally having fewer votes and favourites.

EDIT: as per below discussion, what seems to have happened here is that for reasons unknown, Literotica didn't recognise this as a series even though the chapters are titled in series.
 
Number of favorites is a big factor in why a story is considered similar to another story. Rating is another major factor. I picked four stories that were on my list of most common similar stories ("Lovers Without Realizing It", "Teen Turns: Boyfriend’s Mom", "What She Wants Is You" and "The Return of Dr. Mecuniam"). I then threw all the stand-alone stories that had those as Similar Stories into rating buckets. In my data set, there are 504 stand-alone stories with a rating in the range of 4.55 to 4.59. 233 had "Lovers Without Realizing It" as a Similar Story, 16 had "Teen Turns: Boyfriend’s Mom" as a Similar Story, 2 had "What She Wants Is You" as a Similar Story, and 1 had "The Return of Dr. Mecuniam" as a Similar Story. As you can see, a story is more likely to be picked as similar the closer its rating is to the other story's rating.
Range     | Stand-alone stories | 4.83-rated story | % of range | 4.60-rated story | % of range | 4.17-rated story | % of range | 3.67-rated story | % of range
----------|---------------------|------------------|------------|------------------|------------|------------------|------------|------------------|-----------
4.90-4.94 |   9 |   5 | 55.56% |  0 | 0.00% |  0 |  0.00% |  0 |  0.00%
4.85-4.89 |  49 |  39 | 79.59% |  0 | 0.00% |  0 |  0.00% |  0 |  0.00%
4.80-4.84 | 131 |  91 | 69.47% |  0 | 0.00% |  0 |  0.00% |  0 |  0.00%
4.75-4.79 | 231 | 168 | 72.73% |  4 | 1.73% |  0 |  0.00% |  0 |  0.00%
4.70-4.74 | 310 | 192 | 61.94% |  5 | 1.61% |  0 |  0.00% |  0 |  0.00%
4.65-4.69 | 401 | 229 | 57.11% | 17 | 4.24% |  2 |  0.50% |  2 |  0.50%
4.60-4.64 | 445 | 247 | 55.51% | 21 | 4.72% |  0 |  0.00% |  0 |  0.00%
4.55-4.59 | 504 | 233 | 46.23% | 16 | 3.17% |  2 |  0.40% |  1 |  0.20%
4.50-4.54 | 535 | 221 | 41.31% | 18 | 3.36% |  3 |  0.56% |  3 |  0.56%
4.45-4.49 | 662 | 257 | 38.82% | 36 | 5.44% |  3 |  0.45% |  1 |  0.15%
4.40-4.44 | 657 | 206 | 31.35% | 30 | 4.57% |  6 |  0.91% |  4 |  0.61%
4.35-4.39 | 629 | 164 | 26.07% | 42 | 6.68% | 10 |  1.59% |  3 |  0.48%
4.30-4.34 | 615 | 130 | 21.14% | 25 | 4.07% | 13 |  2.11% | 10 |  1.63%
4.25-4.29 | 602 | 125 | 20.76% | 30 | 4.98% | 16 |  2.66% | 11 |  1.83%
4.20-4.24 | 550 | 100 | 18.18% | 34 | 6.18% | 19 |  3.45% | 10 |  1.82%
4.15-4.19 | 545 | 107 | 19.63% | 23 | 4.22% | 23 |  4.22% | 11 |  2.02%
4.10-4.14 | 400 |  54 | 13.50% | 12 | 3.00% | 17 |  4.25% | 10 |  2.50%
4.05-4.09 | 369 |  40 | 10.84% | 16 | 4.34% | 16 |  4.34% |  9 |  2.44%
4.00-4.04 | 351 |  46 | 13.11% | 16 | 4.56% | 25 |  7.12% | 19 |  5.41%
3.90-3.99 | 497 |  45 |  9.05% | 20 | 4.02% | 25 |  5.03% | 19 |  3.82%
3.80-3.89 | 377 |  27 |  7.16% | 28 | 7.43% | 23 |  6.10% | 16 |  4.24%
3.70-3.79 | 268 |  22 |  8.21% |  6 | 2.24% | 21 |  7.84% | 13 |  4.85%
3.60-3.69 | 211 |  14 |  6.64% |  8 | 3.79% | 19 |  9.00% | 13 |  6.16%
3.50-3.59 | 126 |   5 |  3.97% |  7 | 5.56% | 13 | 10.32% | 10 |  7.94%
3.40-3.49 | 114 |   4 |  3.51% |  1 | 0.88% | 14 | 12.28% | 13 | 11.40%
3.30-3.39 |  69 |   2 |  2.90% |  2 | 2.90% | 13 | 18.84% |  6 |  8.70%
3.20-3.29 |  57 |   3 |  5.26% |  0 | 0.00% |  9 | 15.79% |  8 | 14.04%
3.10-3.19 |  47 |   1 |  2.13% |  3 | 6.38% |  6 | 12.77% |  5 | 10.64%
3.00-3.09 |  33 |   1 |  3.03% |  1 | 3.03% |  3 |  9.09% |  2 |  6.06%
2.75-2.99 |  50 |   3 |  6.00% |  2 | 4.00% |  4 |  8.00% |  1 |  2.00%
2.50-2.74 |  21 |   0 |  0.00% |  0 | 0.00% |  2 |  9.52% |  2 |  9.52%
2.25-2.49 |   9 |   0 |  0.00% |  0 | 0.00% |  2 | 22.22% |  2 | 22.22%
2.00-2.24 |  12 |   0 |  0.00% |  1 | 8.33% |  2 | 16.67% |  2 | 16.67%
1.50-1.99 |   5 |   0 |  0.00% |  0 | 0.00% |  2 | 40.00% |  2 | 40.00%
1.00-1.49 |   1 |   0 |  0.00% |  0 | 0.00% |  0 |  0.00% |  0 |  0.00%
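For anyone who wants to repeat the bucketing on their own data, here is a minimal sketch of how a table like this could be tallied. The sample records and the function names are made up for illustration, and it uses fixed 0.05-wide buckets throughout, whereas the table switches to wider buckets below 4.00.

```python
from collections import defaultdict

# Hypothetical records: (rating, set of Similar Story titles shown for that story).
stories = [
    (4.57, {"Lovers Without Realizing It", "Story X"}),
    (4.58, {"Story Y"}),
    (4.91, {"Lovers Without Realizing It"}),
    (3.45, {"What She Wants Is You"}),
]


def bucket_label(rating, width_hundredths=5):
    """Map a rating to its bucket label, e.g. 4.57 -> '4.55-4.59'."""
    # Work in integer hundredths to avoid floating-point edge cases.
    hundredths = round(rating * 100)
    low = hundredths // width_hundredths * width_hundredths
    high = low + width_hundredths - 1
    return f"{low / 100:.2f}-{high / 100:.2f}"


def share_by_bucket(stories, target):
    """Per bucket: (stories listing target, stories in bucket, percent)."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for rating, similar in stories:
        label = bucket_label(rating)
        totals[label] += 1
        if target in similar:
            hits[label] += 1
    return {
        label: (hits[label], totals[label], 100 * hits[label] / totals[label])
        for label in totals
    }


result = share_by_bucket(stories, "Lovers Without Realizing It")
for label in sorted(result, reverse=True):
    count, total, percent = result[label]
    print(f"{label}: {count} of {total} ({percent:.2f}%)")
```

Each row of the table is one bucket's count and percentage for one of the four target stories.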
 
There are exceptions - I mentioned a Chapter 6 where four of the Similar Stories are previous instalments in the same work. But I don't see it often.

It might be a deliberate part of the algorithm to avoid recommending sequel chapters when there's any alternative, or it might just be a consequence of sequels generally having fewer votes and favourites.
From what I've seen, it will always pick the first story in a series. In the case of the Chapter 6 story, Literotica might have decided that some chapters in that series were in fact another series, starting with Chapter 6. You can see an example of that here.
 
From what I've seen, it will always pick the first story in a series. In the case of the Chapter 6 story, Literotica might have decided that some chapters in that series were in fact another series, starting with Chapter 6. You can see an example of that here.

Ah, yes, you're right. For some reason Lit has not registered the chapters I was looking at as a series, even though the author appears to have titled them correctly for a series.
 
I am surprised similar stories are frequently in different categories. I would have imagined a default setting to be ‘same category.’

Also interesting is finding two of my twenty other stories (one same category, one not) listed in the similar stories section. This suggests the author is a feature in their algorithm.
 
Number of favorites is a big factor in why a story is considered similar to another story. Rating is another major factor.

I'm not so sure rating is a factor. I think it's favorites. It stands to reason that stories more often favorited also will have higher scores, on average. I think that's what we are seeing -- stories on Favorites lists tend to have higher scores, not because the algorithm looks at higher scores, but because stories that have the most favorites have higher scores.

I think the key determination of whether a story shows up on the Similar Stories list is looking at the number of readers that have favorited both stories. That seems most consistent to me with what I see on my stories' Similar Stories lists. It explains why the stories on these lists tend to have very high views, votes, and scores. It also explains why stories can show up on Similar Stories lists without being in the same category or sharing the same tags.
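To make the "common favorites" idea concrete, here is a minimal sketch of how such a ranking could work. This is only a guess at the approach, not Literotica's actual algorithm; the reader names, story titles, and function names are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical favorites lists: reader -> set of favorited story titles.
favorites = {
    "reader1": {"A", "B", "C"},
    "reader2": {"A", "B"},
    "reader3": {"B", "C"},
    "reader4": {"A", "D"},
}


def co_favorite_counts(favorites):
    """Count, for every pair of stories, how many readers favorited both."""
    counts = Counter()
    for stories in favorites.values():
        for s1, s2 in combinations(sorted(stories), 2):
            counts[(s1, s2)] += 1
    return counts


def similar_stories(story, favorites, top_n=5):
    """Rank other stories by the number of readers who favorited both."""
    shared = Counter()
    for (s1, s2), n in co_favorite_counts(favorites).items():
        if s1 == story:
            shared[s2] = n
        elif s2 == story:
            shared[s1] = n
    # Most shared readers first; ties broken alphabetically.
    ranked = sorted(shared.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


print(similar_stories("A", favorites))
```

Under a scheme like this, heavily favorited stories end up on many lists simply because they share readers with almost everything, and category and tags never enter into it.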
 
Also interesting is finding two of my twenty other stories (one same category, one not) listed in the similar stories section. This suggests the author is a feature in their algorithm.

This fact, too, is explained by the "common favorites" algorithm. It stands to reason that you have readers who have favorited multiples of your stories, and that if readers have favorited your story A they are more likely to favorite your story B.
 
Looking more at the data, this post is wrong. I need to do more research, but it looks like a huge part of what determines the Similar Story is the number of Favorites of the story.

<--- Start ignoring from here --->
Looking at my table yesterday, you'll see that 44 out of the 58 stand-alone stories with a rating of 4.85 or better have "Lovers Without Realizing It" as a Similar Story. Focusing on those 58 stories, I found that having a high number of favorites is associated with having "Lovers Without Realizing It" as a Similar Story. The top 28 of those 58 stories in terms of number of favorites have "Lovers Without Realizing It" as a Similar Story. The bottom 6 (and 8 of the bottom 9) don't have "Lovers Without Realizing It" as a Similar Story.

Then I decided to look at a story that had a high rating, a relatively low number of favorites for its rating, and a decent number of Similar Stories. I picked "2015 - Kiss Me, You Fool" by AnAngelReally. Of the six stories that have it as a Similar Story, four are also written by AnAngelReally. The other two stories that had "2015 - Kiss Me, You Fool" as a Similar Story seemed to have nothing in common with it.

So when the Similar Stories are being determined for story S1, I think a score is calculated for all Literotica stories. I think story S2 gets points for:
* The number of favorites S2 has
* S1 and S2 being in the same category
* The absolute of Rating(S1) - Rating(S2)
* The absolute of Number of Favorites(S1) - Number of Favorites(S2)
* S1 and S2 having the same author
<--- Stop ignoring --->
 
Looking at my table yesterday, you'll see that 44 out of the 58 stand-alone stories with a rating of 4.85 or better have "Lovers Without Realizing It" as a Similar Story. Focusing on those 58 stories, I found that having a high number of favorites is associated with having "Lovers Without Realizing It" as a Similar Story. The top 28 of those 58 stories in terms of number of favorites have "Lovers Without Realizing It" as a Similar Story. The bottom 6 (and 8 of the bottom 9) don't have "Lovers Without Realizing It" as a Similar Story.

Then I decided to look at a story that had a high rating, a relatively low number of favorites for its rating, and a decent number of Similar Stories. I picked "2015 - Kiss Me, You Fool" by AnAngelReally. Of the six stories that have it as a Similar Story, four are also written by AnAngelReally. The other two stories that had "2015 - Kiss Me, You Fool" as a Similar Story seemed to have nothing in common with it.

So when the Similar Stories are being determined for story S1, I think a score is calculated for all Literotica stories. I think story S2 gets points for:
* The number of favorites S2 has
* S1 and S2 being in the same category
* The absolute of Rating(S1) - Rating(S2)
* The absolute of Number of Favorites(S1) - Number of Favorites(S2)
* S1 and S2 having the same author

Question: Why in the world use "similarity of ratings" as a criterion for creating a similar stories list? I suppose it might yield more similar results, but not helpful ones. It seems fair to suppose that when a reader reaches the end of a story and looks at the list, the reader is going to want to see not just similar stories, but the similar stories with the highest scores. Nobody is deliberately looking for badly ranked stories. It doesn't make sense.

I wonder if it's possible that the correlation of scores that's evident in the table could reflect something else, such as reader preference. A reader might not deliberately look for a bad story, but he may gravitate toward stories that just happen to have bad scores.

Another way of doing this would be to look at your highest rated story and your lowest rated story, and check out all the stories on their similar stories lists, and see how they rank in terms of ratings and favorites.
 
I wondered how big the pool of possible candidates for LS Similar Stories is. Not counting stories from other categories, the pool would seem to consist of all LS stand-alone stories and all LS series. So how many LS series are there?

Searching the LS category for titles containing the text "ch. 01" yields 1230 stories. However, searching for "ch. 02" yields 1593, because some series don't start numbering until chapter 2. Adding in titles containing "ch. 2" (149) and "pt. 02" (422) gives 2164. So presumably there are at least that many series, although, as has been pointed out, not all series use chapter (or part) numbers.

There is also now a "Story Series" page accessible from the main Story Contents page. It lists 3556 LS series. However, this includes some series that have been taken down and some that don't start in LS but contain a later chapter in LS. These wouldn't show up as LS series in 8letter's tabulation.

So, presumably there are somewhere between 2164 and 3556 series that start in LS and are possible candidates for SS lists. Not sure whether this sheds light on anything or not.
 
One of the pleasures of roaming the stacks is that you sometimes come across delightful treasures that have been sitting there patiently on the shelf all these years just waiting for someone to take them down and give them another glance.

In the course of my recent bibliographic investigations, I happened upon this lovely little series: Hail. Two charming vignettes in two very short chapters that describe an orphan girl from the streets being delivered to a convent and her encounter with the young novitiate charged with looking after her. The writing is crisp and playful, the heroine's feelings and inner monologue as clear and recognizable as a bell. A true pleasure to read.

The author gentlebutfirm has submitted a total of 10 stories, the most recent in 2006. His or her bio states, "My primary motivation in publishing . . . is to receive feedback. I'd love to know what you think." Further investigation showed that his or her entire wonderful body of work has received a total of ten gaunt comments in all the time it has been posted. It made me sad.

I like to leave comments on stories that I like, thanking the author and telling them what it was I liked. I wish my comments were more visible to other readers. I mean, they're there, but they're tucked away just like the stories themselves, back in the stacks, up on the shelves. I can see all the comments I've posted myself as a little anthology on the My Activity tab of my User Page. I wish others could access my anthology as well.

The next closest thing to that is the Favorites List. (There has been talk of an expanded "lists" capability, but I'm not sure how it works or whether it is even up yet.) I used to reserve favoriting for stories I thought were the very best of the best. I didn't want anybody pooh-poohing my selections. But now I'm starting to think that irishsexstorylover's approach is probably better.

From now on, if I read a story and I like it, and I think it deserves to be read, I'm going to favorite it. Even if it isn't the best of the best, as long as I would be glad to have someone who likes the kind of stories that I like recommend it to me. Chances are I'll leave a comment, too.

My tastes and predilections are not exactly mainstream. I like situation, wit, thoughtfulness, and generosity of spirit even more than titillation (even more than statistics). Hopefully others will find my list useful, even if it does eventually grow to be a kajillion entries long. And hopefully it will help a few deserving stories stay on, or find their way back onto, appreciative screens.
 
Question: Why in the world use "similarity of ratings" as a criterion for creating a similar stories list? I suppose it might yield more similar results, but not helpful ones. It seems fair to suppose that when a reader reaches the end of a story and looks at the list, the reader is going to want to see not just similar stories, but the similar stories with the highest scores. Nobody is deliberately looking for badly ranked stories. It doesn't make sense.

I wonder if it's possible that the correlation of scores that's evident in the table could reflect something else, such as reader preference. A reader might not deliberately look for a bad story, but he may gravitate toward stories that just happen to have bad scores.

I suspect this is correct. Some themes do better in LS than others (e.g. long romance pieces are more likely to score high than short stroke). Favouriting patterns will tend to follow themes; if recommendations are influenced by favouriting patterns, then that's all you need to create a "similar stories tend to have similar scores" effect without explicitly building it into the algorithm.
 
One of the pleasures of roaming the stacks is that you sometimes come across delightful treasures that have been sitting there patiently on the shelf all these years just waiting for someone to take them down and give them another glance.

This would make an interesting thread - recommend an old, unappreciated story for some reader love.
 
The author gentlebutfirm has submitted a total of 10 stories, the most recent in 2006. His or her bio states, "My primary motivation in publishing . . . is to receive feedback. I'd love to know what you think." Further investigation showed that his or her entire wonderful body of work has received a total of ten gaunt comments in all the time it has been posted. It made me sad.
I read the Hail story, and what a little sweetie it was. Reminded me of some late Victorian smut, which I guess is what it was - setting aside a few contemporary anachronisms. But yes, some brave aspirations dashed. I wonder if the author ever comes back to see something they wrote fifteen years ago?
 
How do Similar Stories get picked for a story? I'm going to zig and zag here from the most powerful explanation to the next most powerful explanation.

silkstockingslover is the most favorited author on Literotica. She has 116 Lesbian Sex stories in my data set. Her Similar Story links are highly self-referential: 85% of them are to her own stories, many of them in categories besides Lesbian Sex. 52 are to the Noncon story "Owning My Friend's Mom" (9%). I have no idea why that story is selected so much. The most favorited LS story "Lovers Without Realizing It" shows up 18 times (3%). Other stories show up 17 times.

"Lovers Without Realizing It" is the most favorited Lesbian Sex story and the one that comes up most often as a Similar Story link. The choosing of "Lovers Without Realizing It" as a Similar Story appears to be associated with the number of favorites a story has. Setting aside silkstockingslover's stories, of the top 100 most favorited stories, 92% have "Lovers Without Realizing It" as a Similar Story. Of the next 300 most favorited stories, 91.3% have "Lovers Without Realizing It" as a Similar Story. Of the next 509 most favorited stories, 378 (74%) have "Lovers Without Realizing It" as a Similar Story. As the number of favorites decreases, the frequency with which "Lovers Without Realizing It" is picked as a Similar Story decreases. Looking at just stories that Literotica thinks are stand-alone stories vs all stories doesn't change the frequency much.

Jumping to something completely different - of the 388 stories with exactly one favorite:
* 277 (71%) have "The Wife, the Artichoke, and Jade" as a Similar Story
* 277 have "The Spring of 1984" as a Similar Story
* 277 have "The Return of Dr. Mecuniam" as a Similar Story
* 277 have "What She Wants Is You" as a Similar Story
* 277 have the N&N "The Newlin's, Marcie & Mark Pt. 01" as a Similar Story
Five stories tie for the next most common Similar Story at 10 references each. If I restrict the stories to just those that Literotica thinks are stand-alone stories, the percentage rises to 78%.

Returning to self-referential links, some authors have a lot of their Similar Story links refer to their own stories and some don't. Here are the ones with the highest % self-links:
krr1957 - 17 stories, 51 self-links = 60%
careythomas - 11 stories, 31 self-links = 56%
deliciousthoughts - 16 stories, 44 self-links = 55%
watchdwag - 30 stories, 73 self-links = 49%
HeyAll - 56 stories, 134 self-links = 48%
VeryDirtyDenise - 16 stories, 38 self-links = 48%
MercuryLove31 - 30 stories, 64 self-links = 43%
Pixiehoff - 225 stories, 433 self-links = 38%
JukeboxEMCSA - 42 stories, 78 self-links = 37%
babylez - 14 stories, 25 self-links = 36%
Alex the Cat - 17 stories, 30 self-links = 35%
LustyLee77 - 39 stories, 61 self-links = 31%
herslave420 - 10 stories, 15 self-links = 30%
GymTeacherYouDeserve - 27 stories, 40 self-links = 30%
pujalata980 - 17 stories, 25 self-links = 29%
JuilaDixon - 10 stories, 14 self-links = 28%
Salandar - 10 stories, 14 self-links = 28%
KatPissinger - 24 stories, 33 self-links = 28%
ScattySue - 25 stories, 33 self-links = 26%
BrokenSpokes - 20 stories, 26 self-links = 26%
Almost two-thirds of the authors who have a Similar Story self-link have fewer than 10% of their Similar Stories be self-links. 2% or 3% is very common.
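The post doesn't spell out how the self-link percentages are calculated. A minimal sketch, assuming each story shows five Similar Story links (so an author with N stories has 5 x N links in total), reproduces the figures in the list above. The function name and the five-links constant are assumptions for illustration.

```python
# Assumption: each story page shows five Similar Story links.
LINKS_PER_STORY = 5


def self_link_percent(num_stories, self_links, links_per_story=LINKS_PER_STORY):
    """Share of an author's Similar Story links that point to their own stories."""
    total_links = num_stories * links_per_story
    return 100 * self_links / total_links


# (author, stories, self-links) taken from the list above.
authors = [
    ("krr1957", 17, 51),
    ("careythomas", 11, 31),
    ("Pixiehoff", 225, 433),
]

for name, num_stories, self_links in authors:
    print(f"{name}: {self_link_percent(num_stories, self_links):.0f}%")
```

For example, krr1957 has 17 x 5 = 85 links, and 51 of 85 is 60%.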

"A Proper Send-off" is the second-most favorited LS story and comes up second most often as a Similar Story link, and, like "Lovers Without Realizing It", the most favorited stories most frequently have it as a Similar Story link, though not as much as "Lovers Without Realizing It". Of the top 100 stories, 86% have "A Proper Send-off" as a Similar Story. Of the next 300 most favorited stories, 65% have "A Proper Send-off" as a Similar Story. Of the next 509 stories, 47% have "A Proper Send-off" as a Similar Story.

The third-most favorited LS story, "Bedding the Babysitter Ch. 01", comes up the third most often as a Similar Story link. However, it being selected as a Similar Story does not seem to be associated with the number of favorites. Or the story rating. What it does seem to be associated with is the number of views. 49.2% of the top 500 most viewed LS stories have "Bedding the Babysitter Ch. 01" as a Similar Story. The next 500 drop to 38.6%. The next 1000 drop to 31.2%.

The fourth-most favorited LS story, "Like A Secret Being Told", comes up the sixth most often as a Similar Story link. Like "Lovers Without Realizing It" and "A Proper Send-off", the most favorited stories most frequently have it as a Similar Story link. Of the top 100 stories, 71% have "Like A Secret Being Told" as a Similar Story. Of the next 300 most favorited stories, 45% have "Like A Secret Being Told" as a Similar Story. Of the next 509 stories, 33% have "Like A Secret Being Told" as a Similar Story.

The story that comes up fourth most frequently as a Similar Story is "The Case of the Sneaky Valentine", which is the 7th most favorited story. It is an unusually long story at 12 pages, and its frequency of selection as a Similar Story is associated with the number of pages in the story it is listed for. Of stories 10 pages or longer, "The Case of the Sneaky Valentine" is a Similar Story 48% of the time. Of stories 8 or 9 pages long, the percentage is 29%. Of stories 6 or 7 pages long, the percentage is 24%. Of stories 4 or 5 pages long, the percentage is 21%.

The story that comes up fifth most frequently as a Similar Story is "Could You Be Mine?", which is the 11th most favorited story. I see nothing special about the story that would explain why it appears so much more frequently as a Similar Story than the other stories in the Top 15 most frequently favorited stories. "Could You Be Mine?" being selected as a Similar Story is associated with the story's rating. Of stories with a 4.89 rating or better, 36% have "Could You Be Mine?" as a Similar Story. Of stories with a rating of 4.88 or 4.87, the percentage is 45%. Of stories with a rating of 4.86 or 4.85, the percentage is 37%. Of stories with a rating of 4.84 or 4.83, the percentage is 39%. Of stories with a rating of 4.82 or 4.81, the percentage is 27%. Of stories with a rating of 4.80, 4.79 or 4.78, the percentage is 23%.

I don't see any consistent explanation for how a story is selected as a Similar Story.
 
I agree with Bramblethorn that a good way to get insight into the way that Similar Stories are selected is by looking closely at the Similar Stories lists of some individual stories. I looked at several stories for which the Similar Stories lists are based on tags.

Observation 0.1. The new ("public beta") and old ("classic") story viewers don't always show the same Similar Stories. The new viewer shows 5 Similar Stories, but the list can be expanded to 20 by clicking the "More Stories" button. The old viewer shows only 5 Similar Stories. The two lists often show the same stories, but not always.

Observation 0.2. One key difference is in the way that chapter stories are handled. The old-style lists treat each chapter as a separate story. They sometimes show multiple chapters from the same series on the same SS list. The new-style lists treat each series as a single story. They only show a series once and usually refer to it by its first chapter.

Methodology. You can find a story's tags in the sidebar at the end of the story. You can find all the stories across the whole site that have a particular tag or combination of tags on the Story Tag Portal (STP). (You can get to the STP by clicking on one of the tags at the end of a story or by following the link on the F.A.Q./Stories/Tags page). You can find all the stories by a particular author that have a specific tag or combination of tags on the Search Stories page.

Story 1. Aimee and Will: Tendings, by let0423, L&T, 2009. This is an older story that has never been favorited. It has two obscure tags. This table shows the number of stories in each SS list and across the whole site that have each of these tags.

Number of stories in:

Tag    | old SSL | new SSL | STP
-------|---------|---------|----
aimee  |       2 |       8 |  20
will   |       3 |      12 |  30
total: |       5 |      20 |  50

SSL = Similar Stories List, STP = Story Tag Portal

Observation 1.1. Every story on both of the SS lists has one tag in common with Story 1. There is only one other story on the site that shares both tags: "Aimee and Will: Imaginings" by the same author.

Observation 1.2. In this case, the tags are the names of the characters in the story, and so matching them doesn't necessarily provide stories that are similar in any meaningful way. At best the recommended stories have a character with the same name. However, in at least one of the stories "will" refers to a legal document, and so the similarity is even more tenuous.

Observation 1.3. The two lists treat series differently. For example, the old-style list includes "At Long Last: Gwen Ch. 03" and "At Long Last: Gwen Ch. 02", both of which have the tag "will." The new-style SS list doesn't include either of these chapters, but it does include "At Long Last: Gwen," the unnumbered first chapter of the series, which doesn't have the tag itself.

Observation 1.4. Both lists include stories published after 2009. This means they must both have been updated since Story 1 was published.

Observation 1.5. One of the stories on the new-style SS list is in German. It has the tag "will," but it doesn't show up on the standard "will" STP page. It turns out that German stories have their own Story Tag Portal. Who knew? Somehow, though, this German story ended up on the SS list of this English-language story.

Reverse Engineering 1.1. It seems fairly certain that both SS lists were drawn from the set of stories that have one tag in common with Story 1. In this case, since the tags are obscure, the pool of candidates is fairly small: 50 stories in all (or only 32 if each series is counted only once). The SS lists provide a reasonable sampling of this set, even if the similarity among the stories is only superficial.
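The "one tag in common" pool described in Reverse Engineering 1.1 can be sketched in a few lines. This is only an illustration of the idea, not the site's actual code: the function names, the story titles, and every tag other than "aimee" and "will" are made up for the example.

```python
from collections import defaultdict

def build_tag_index(stories):
    """Map each tag to the set of story titles that carry it."""
    index = defaultdict(set)
    for title, tags in stories.items():
        for tag in tags:
            index[tag].add(title)
    return index

def one_tag_pool(target, stories):
    """Stories (other than the target) sharing at least one tag with it."""
    index = build_tag_index(stories)
    pool = set()
    for tag in stories[target]:
        pool |= index[tag]
    pool.discard(target)  # a story is not a candidate for its own list
    return pool

# Hypothetical toy data; only the tags "aimee" and "will" come from the post.
stories = {
    "target": {"aimee", "will"},
    "a": {"aimee", "college"},
    "b": {"will", "inheritance"},
    "c": {"beach"},
}
pool = one_tag_pool("target", stories)
print(sorted(pool))
```

With obscure tags the pool stays small, which is why the SS lists can cover a large share of it.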

Story 2. A Letter to Dani Ch. 01, by laplappapillon, L&T, 2009. This is another older story that has never been favorited. It has four tags that are less obscure.

Number of stories in:

Tag                           old SSL   new SSL    STP*
second person pov                   4        13   ~2300
memories                            -         -    ~300
water                               -        15    ~600
nature                              1        12    ~500
total:                              5        40   ~3700

Tag pair                      old SSL   new SSL    STP*
second person pov, memories         -         -       -
second person pov, water            -         8      10
second person pov, nature           -         5       9
memories, water                     -         -       2
memories, nature                    -         -       -
water, nature                       -         7      12
total:                              0        20      33

*The counts in the STP column don't include the references to Story 2 itself; i.e., they only include possible candidates for the SS lists.

Observation 2.1. Every story on the old-style SS list has one tag in common with Story 2. The STP shows that there are about 3700 such stories across the site.

Observation 2.2. Every story on the new-style SS list has two tags in common with Story 2. The STP shows that there are only 33 such stories across the site, and no stories other than Story 2 itself that match more than two tags. The old- and new-style lists don't have any stories in common.

Observation 2.3. In this case, since the tags are descriptive, the stories on both lists presumably have a meaningful similarity with Story 2. The stories on the new-style list are presumably more similar since they match two tags.

Observation 2.4. Both SS lists include stories published after Story 2 itself.

Reverse Engineering 2.1. It seems fairly certain that the old-style SS list was drawn from the set of stories that have one tag in common with Story 2, and that the new-style list was drawn from the set of stories that have two tags in common. Requiring two matches reduced the pool of possible candidates from 3700 to 33. The new-style SS list includes more than half of the stories in the reduced pool.
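To see how requiring a second matching tag shrinks the pool, here is a minimal sketch that counts shared tags per candidate and filters on a threshold. The helper names and the candidate stories are hypothetical; only the four tag names are taken from Story 2, and nothing here is known to be how the site does it.

```python
def shared_tag_counts(target_tags, stories):
    """Number of tags each story shares with the target (zero matches omitted)."""
    counts = {}
    for title, tags in stories.items():
        shared = len(target_tags & tags)
        if shared:
            counts[title] = shared
    return counts

def pool_with_min_shared(target_tags, stories, min_shared):
    """Candidates that share at least `min_shared` tags with the target."""
    counts = shared_tag_counts(target_tags, stories)
    return {title for title, n in counts.items() if n >= min_shared}

target_tags = {"second person pov", "memories", "water", "nature"}

# Hypothetical candidates, invented for the example.
candidates = {
    "p": {"water", "nature", "hiking"},
    "q": {"second person pov"},
    "r": {"memories", "water"},
    "s": {"nature"},
    "t": {"office"},
}

pool_one = pool_with_min_shared(target_tags, candidates, 1)  # old-style guess
pool_two = pool_with_min_shared(target_tags, candidates, 2)  # new-style guess
print(sorted(pool_one), sorted(pool_two))
```

The two-tag pool is always a subset of the one-tag pool, which fits the drop from about 3700 candidates to 33 reported for Story 2.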

Story 3. The CheongSam, by fr3div3r, EC, 10/03/21. This is a new story with three widely used tags.

Number of stories in:

Tag           old SSL   new SSL      STP
asian               -        12    ~2900
anal                5        18   ~37800
sex                 -        10    ~3100
total:              5        40   ~43800

Tag pair      old SSL   new SSL      STP
asian, anal         -        10     ~400
asian, sex          -         2       19
anal, sex           -         8     ~400
total:              0        20     ~800

Observation 3.1. Every story on the old-style SS list has one tag in common with Story 3, and every story on the new-style list has two tags in common with it. There aren't any stories besides Story 3 itself that have all three tags.

Observation 3.2. This analysis was done on 10/03/21, the day Story 3 was published. The story had already been favorited, but the SS lists were still based on tags. Since then, the old-style SS list has been updated at least twice, while the new-style list has remained the same.

Reverse Engineering 3.1. As with Story 2, it seems fairly certain that the old-style SS list was drawn from the set of stories that have one tag in common with the story itself, and that the new-style list was drawn from the set of stories that have two tags in common with it. In this case requiring two tags reduced the candidate pool from ~43,800 to ~800. These are still fairly large pools and it could be that other factors such as category, rating, publication date, and popularity played a role in selecting the final SS lists.

Story 4. An Innocent in Paris Pt. 06, by Joannexx, LS, 10/04/21. This is another new story, the 6th chapter of a series. It has 10 tags, 8 of them widely used.

Number of stories in:

Tag                         old SSL   new SSL      STP
18 year old student               -         *        1
drugs                             -         *    ~1300
bdsm                              3         *   ~18500
marijuana                         -         *     ~300
smoking                           -         *     ~900
cigarette                         -         *     ~200
love                              1         *   ~11000
orgasm                            1         *    ~5100
inexperienced lesbian sex         -         *        -
corruption                        -         *     ~700

*not counted due to gremlin gumption deficit

Observation 4.1. Every story on the old-style SS list has one tag in common with Story 4.

Observation 4.2. The top story on the new-style SS list is a self-reference to "An Innocent in Paris Pt. 01." Some of the earlier chapters of the series have more than 4 tags in common with Story 4.

The second story on the new-style list, "Corrupting Amber," has 4 tags in common with Story 4: "corruption," "drugs," "cigarette," and "smoking."

The next 16 stories on the new-style list have 3 tags in common with Story 4. For example, 6 stories have "bdsm," "love," and "orgasm," and 2 have "drugs," "corruption," and "smoking."

The last two stories on the new-style list have only 2 tags in common with Story 4. "Abby" has "love" and "orgasm." "Brainy Teen" has "corruption" and "drugs."

Observation 4.3. One of the stories on the new-style list is "Lustful Lyrics, Ch. 10." This is an exception to the rule that the new-style SS lists usually only refer to the first chapter of a series. This is a Chain Story, and it is the first of two chapters by Tara Cox. So, apparently, the new-style lists handle Chain Stories differently than other series.

Observation 4.4. This analysis was done on 10/04/21, the day Story 4 was published. It had already been favorited, but both SS lists were still based on tags. Since then the old-style SS list has been updated at least once, but the new-style list has not.

Reverse Engineering 4.1. As in the other examples, the old-style SS list was almost certainly drawn from the set of stories that have one tag in common with Story 4.

The way in which the new-style list was chosen is not so clear. For example, I'm not sure whether there are any other stories on the site that have 4 tags in common with Story 4. Finding this out would require checking 70 possible tag combinations across 186 chapters. This is left as an exercise for the reader.

My guess is that the new selection algorithm tries to find several long tag combinations that produce non-empty sets of matches, and then draws the SS list from the resulting pool. Or something like that.
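For what it's worth, the figure of 70 equals the number of ways to pick 4 tags out of 8, which would fit the 8 widely used tags of Story 4 (that reading is an assumption, the post doesn't spell it out). The sketch below enumerates those combinations and then ranks candidates by how many tags they share, which would reproduce the 4, then 3, then 2 ordering seen in Observation 4.2. The tag names are placeholders, the stories are invented, and the ranking rule is a guess at the behavior, not the site's algorithm.

```python
from itertools import combinations
from math import comb

# Placeholder names standing in for the 8 widely used tags of Story 4.
widely_used = [f"tag{i}" for i in range(1, 9)]

# Every 4-tag combination that would have to be checked by hand on the STP.
four_tag_combos = list(combinations(widely_used, 4))
print(len(four_tag_combos), comb(8, 4))

def rank_by_shared_tags(target_tags, stories, min_shared=2, limit=20):
    """Order candidates by how many tags they share with the target."""
    scored = []
    for title, tags in stories.items():
        shared = len(target_tags & tags)
        if shared >= min_shared:
            scored.append((shared, title))
    # Most shared tags first; ties broken alphabetically (an arbitrary choice).
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:limit]

target_tags = set(widely_used)

# Hypothetical candidates, invented for the example.
stories = {
    "four_match": {"tag1", "tag2", "tag3", "tag4"},
    "three_match": {"tag1", "tag6", "tag7", "unrelated"},
    "two_match": {"tag6", "tag7"},
    "one_match": {"tag2", "unrelated"},
}
ranking = rank_by_shared_tags(target_tags, stories)
print(ranking)
```

Counting shared tags per candidate avoids checking each combination separately, so a real implementation would not need the 70 lookups at all.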


Conclusions.

C1. Similar Stories lists for new stories and stories that have never been favorited are based on tags.

C2. Similar Stories lists are updated from time to time and can include stories that are more recent than the story itself.

C3. The site maintains two separate SS lists for each story: the old-style list (5 entries) shown in the old story viewer and the new-style list (20 entries) shown in the new story viewer.

C4. The new-style SS lists are "series aware." They don't show multiple chapters from the same series, and they usually refer to series by their first chapters, even when the similarity is based on tags in a later chapter.

C5. The site uses different algorithms to select the old- and new-style SS lists, at least as far as tags are concerned. The old algorithm looks for stories that have one tag in common with the story itself. The new algorithm looks for stories that have more than one tag in common. Because of this, the stories recommended by the new algorithm can be expected to be more appropriate than the ones recommended by the old algorithm.

C6. The algorithms don't have any real understanding of what the tags mean within the context of any particular story. Non-descriptive or ambiguous tags can result in Similar Stories with only superficial similarities.

C7. Further study is needed to understand the way Similar Stories are selected for stories that have been favorited and the possible role played by other factors such as category, rating, publication date, and popularity in the selection process.
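The "series aware" behavior in C4 could look something like the following sketch: each chapter is swapped for the first chapter of its series, and repeats are dropped. The mapping from chapters to first chapters is assumed to be available from somewhere; the function name and the stand-alone title are hypothetical, and the Gwen titles are the ones from Observation 1.3.

```python
def collapse_series(ranked_titles, first_chapter_of):
    """Replace chapters by their series' first chapter, keeping the first hit only."""
    seen = set()
    result = []
    for title in ranked_titles:
        # Stand-alone stories are not in the mapping and represent themselves.
        representative = first_chapter_of.get(title, title)
        if representative not in seen:
            seen.add(representative)
            result.append(representative)
    return result

ranked = [
    "At Long Last: Gwen Ch. 03",
    "Some Stand-Alone Story",  # hypothetical title
    "At Long Last: Gwen Ch. 02",
]
first_chapter_of = {
    "At Long Last: Gwen Ch. 03": "At Long Last: Gwen",
    "At Long Last: Gwen Ch. 02": "At Long Last: Gwen",
}
collapsed = collapse_series(ranked, first_chapter_of)
print(collapsed)
```

This also shows how a first chapter can land on a list without carrying the matching tag itself, as happened with "At Long Last: Gwen."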
 
My theory is that the algorithm initially is based on tags when a story has no favorites yet, but that this pattern gives way to a common-favorites based algorithm over time.

1. One way to gather more information on this would be to look at the Similar Stories list of one's own story immediately after publication, before it has gathered any favorites, and to see how that list changes over time.

2. I looked at my story Late Night on the Loveseat with Mom, which was published over four years ago and is my most favorited story. It's a mother-son incest/taboo story. One of its "Similar Stories" is The Sleepover by samslam. That is a brother-sister incest story, and it shares no common tags with my story.

3. My most-viewed Exhibitionist & Voyeur story is A Bikini With A Mind Of Its Own. One of the stories on its Similar Stories list is Lovecraft's That's What Friends Are For, which is a First Time story. They have no tags in common, and they are in different categories. I believe that is Lovecraft's most popular story, in terms of views and favorites. It also has a high score. My story has far fewer views but it's got a high number of views and favorites for an Exhibitionist & Voyeur story. Its score, however, is only 4.45, and with the exception of one brief period last summer it has never crossed the 4.5 threshold in four and a half years since its publication.

I suspect the reason it's on the list is favorites: Lovecraft and I both write a lot of incest stories, so we probably have many readers in common. Many of the same people who favorited my story probably favorited his story. This makes the most sense to me.
 
For the latest examples I've looked at, the "similar to" listing has mostly been back to my own stories. That's OK with me.
 
These lists have always been hit or miss. Sometimes I'll see stories next to one of mine, or next to something I've read, that make sense as far as genre or at least a word in the title. Other times I've had GM stories show up, and I've never read or published one here.

They seem to be getting a bit better, but I'd be hard pressed to be able to determine what they're using for these things. Your mention of my FT story on your E/V story makes me wonder if it's because we've favorited each other as authors? Maybe read some of each other's work?
 
Here's a question I don't want to bother starting a thread on, but seeing the stat gurus come here...

Is there a list somewhere for stories with the most comments?

I've found the most-productive list by poking around, and even the members who have left the most comments (top 15, I think), but is there anything for the most-commented stories?
 
I think the explanation is this. We both write incest stories, and in particular we both write mom-son incest stories. We both have a high number of followers, although you obviously have many more than I do. You're number 4 on the list and I'm number 92. Our mom-son incest stories draw a high number of views.

So it stands to reason that there would be significant overlap in our readership pools, AND that it's by Literotica standards a fairly high number of readers. Those readership pools may not limit themselves to just mom-son incest stories. Somebody checks out my mom-son incest stories, and then they decide to read my Bikini story, and favorite it, and then they wander over to your story portfolio to read the mom-son stuff, and they happen to find TWFAF. They favorite both Bikini and TWFAF. The number of people who have favorited both of these stories exceeds the number of common favorites that Bikini has with most other stories. Voila, your story ends up on my story's Similar Stories list.

I think this is what's going on.
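The common-favorites idea can be written down as a simple co-occurrence count: for every reader who favorited the target story, tally the other stories that reader also favorited, then take the highest tallies. This is a sketch of the theory only. The reader names, story labels, and function name are made up, and there is no evidence that the site computes it this way.

```python
from collections import Counter

def similar_by_common_favorites(target, favorites_by_reader, limit=20):
    """Rank other stories by how many readers favorited both them and the target."""
    counts = Counter()
    for favorites in favorites_by_reader.values():
        if target in favorites:
            for other in favorites:
                if other != target:
                    counts[other] += 1
    return counts.most_common(limit)

# Hypothetical favorites data, invented for the example.
favorites_by_reader = {
    "reader1": {"mine", "theirs"},
    "reader2": {"mine", "theirs", "other"},
    "reader3": {"mine"},
    "reader4": {"other"},
}
top = similar_by_common_favorites("mine", favorites_by_reader)
print(top)
```

Nothing in this count looks at tags or categories, which would explain how two stories with no tags in common and different categories end up paired.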
 
On a similar note, there's a big (in views/popularity) mom/son story posted today, with the Lap theme. Those always get tons of views.

A tag they used was 'party'.

As a result, a story of mine from several years ago was in the Related list. I also used the tag 'party'.

Overnight I got 5 new favorites for my old story, from having just been on the Related list.
 