python – Notes of a Neuropsychiatry Amateur

Percentile Confidence Interval Calculator

Percentile-Confidence-Interval-Calculation.ipynb

This Python script calculates the 95% confidence interval for a specified percentile (e.g., the 70th percentile) of a dataset. The confidence interval provides a range in which we expect the true percentile value to lie with 95% confidence.

The calculation uses properties of the binomial distribution: each data point either falls below the true percentile or it does not, so the number of points below it can be modeled as a binomial count. Because the interval endpoints are taken from the discrete order statistics of the sample, the result is an approximation, and it is less reliable for small samples or data with many tied values.

Assumptions

1. Binary Outcome: The fundamental assumption behind the binomial distribution is that there is a binary outcome, often termed as ‘success’ and ‘failure’. In the context of percentiles, you can think of ‘success’ as the instances below the percentile and ‘failure’ as the instances above.

2. Fixed Number of Trials: For the binomial distribution, there is a fixed number n of trials. In our case, n represents the total number of data points in our sample.

3. Independence: Each trial (or data point) is independent of others. This means the outcome of one trial does not affect the outcome of another.

4. Constant Probability of Success: The probability of success, q, is the same for each trial. Here, q represents the percentile value. For example, for the 70th percentile, q=0.7.

Why the Binomial Distribution?

The rationale behind using the binomial distribution for percentile confidence intervals is its direct applicability to cases where you’re looking at the proportion of observations below a certain threshold (i.e., a percentile).

When you’re asking about the 70th percentile, you’re essentially inquiring: “What’s the value below which 70% of my data falls?” This can be likened to asking about the number of successes in n trials, where a success is an observation below the desired threshold.

However, it’s important to note that this method provides an approximation. The binomial distribution is discrete and inherently based on counting successes in a set number of trials, while percentiles often come from continuous distributions and may not perfectly adhere to the assumptions above.
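To make the analogy concrete, here is a small simulation sketch. It assumes uniformly distributed data (chosen only for illustration, because the true 70th percentile of a uniform(0, 1) variable is exactly 0.7) and counts how many of n = 150 observations fall below that true percentile. If the binomial picture is right, the counts should have a mean near n*q and a variance near n*q*(1-q).

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, used only for reproducibility
n, q = 150, 0.7
true_percentile = 0.7  # 70th percentile of a uniform(0, 1) distribution

# Draw 2000 independent samples of size n and count the "successes"
# (observations below the true percentile) in each sample.
samples = rng.random((2000, n))
counts = (samples < true_percentile).sum(axis=1)

print("mean count:", counts.mean(), "expected:", n * q)
print("variance:  ", counts.var(), "expected:", n * q * (1 - q))
```

The seed, the sample count of 2000, and the uniform distribution are all assumptions of this sketch, not part of the notebook.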

import numpy as np
from scipy.stats import binom
import seaborn as sns

Get some data

# Load the Iris dataset
iris = sns.load_dataset("iris")
# Use the 'sepal_length' feature
data = iris['sepal_length'].values

print(data[:50])

[5.1 4.9 4.7 4.6 5.  5.4 4.6 5.  4.4 4.9 5.4 4.8 4.8 4.3 5.8 5.7 5.4 5.1
 5.7 5.1 5.4 5.1 4.6 5.1 4.8 5.  5.  5.2 5.2 4.7 4.8 5.4 5.2 5.5 4.9 5.
 5.5 4.9 4.4 5.1 5.  4.5 4.4 5.  5.1 4.8 5.1 4.6 5.3 5. ]

Calculate the 70th percentile

# Calculate the 70th percentile
percentile_70 = np.percentile(data, 70)
print("Min: %f, Max: %f, 70th percentile: %f" % (min(data), max(data), percentile_70))

Min: 4.300000, Max: 7.900000, 70th percentile: 6.300000

Convert the data to “success” (above the 70th percentile) and “failure” (at or below it)

successes = np.sum(data > percentile_70)
failures = len(data) - successes

# Here a "success" is a value above the 70th percentile, so `successes`
# is analogous to `q * n` in the binomial scenario, with q close to 1 - 0.7.
# So, we can set:
n = len(data)
q = successes / n

print("n: %d, q: %f" % (n, q))

n: 150, q: 0.280000

Calculate the 95% confidence interval

The code calculates potential upper (u) and lower (l) bounds for a confidence interval using the binomial distribution’s percent-point function (ppf).

np.ceil(binom.ppf(1 - alpha / 2, n, q)) determines the approximate upper bound for the confidence interval and np.ceil(binom.ppf(alpha / 2, n, q)) for the lower bound.

+ np.arange(-2, 3) extends these bounds by adding an array of [-2, -1, 0, 1, 2], generating a set of potential boundaries around the original estimate.

u gives a sequence of indices in the dataset that demarcate the upper bound of the confidence interval. It starts from the calculated index for the 97.5th percentile and provides two more indices above and two below it.

l gives a sequence of indices in the dataset that demarcate the lower bound of the confidence interval. It starts from the calculated index for the 2.5th percentile and provides two more indices above and two below it.

alpha = 0.05
u = np.ceil(binom.ppf(1 - alpha / 2, n, q)) + np.arange(-2, 3)
u[u > n] = np.inf

l = np.ceil(binom.ppf(alpha / 2, n, q)) + np.arange(-2, 3)
l[l < 0] = -np.inf

print("u: " + ", ".join(map(str, u)))
print("l: " + ", ".join(map(str, l)))

u: 51.0, 52.0, 53.0, 54.0, 55.0
l: 29.0, 30.0, 31.0, 32.0, 33.0

sorted_data = np.sort(data)

# Extract values corresponding to the indices
# Correct way to interpret the u and l values
u_values = sorted_data[n - u.astype(int)]
l_values = sorted_data[l.astype(int) - 1]

print("Upper values:", u_values)
print("Lower values:", l_values)

Upper values: [6.3 6.2 6.2 6.2 6.2]
Lower values: [5.  5.  5.  5.  5.1]

Probability coverage

The code calculates the probability coverage of different combinations of potential confidence intervals formed by the lower bounds (l) and upper bounds (u). Coverage is a matrix of probabilities. The goal is to find the smallest confidence interval that guarantees coverage of at least 1−α.

coverage = np.zeros((len(l), len(u)))
for i, a in enumerate(l):
    for j, b in enumerate(u):
        coverage[i, j] = binom.cdf(b - 1, n, q) - binom.cdf(a - 1, n, q)

if np.max(coverage) < 1 - alpha:
    i = np.where(coverage == np.max(coverage))
else:
    i = np.where(coverage == np.min(coverage[coverage >= 1 - alpha]))

print("Coverage Matrix:")
print(coverage)

print("\nOptimal Indices (i_l, i_u):")
print(i)

Coverage Matrix:
[[0.93135214 0.95028522 0.96430299 0.97438285 0.98142424]
 [0.92730647 0.94623955 0.96025732 0.97033718 0.97737857]
 [0.92096076 0.93989385 0.95391161 0.96399148 0.97103286]
 [0.91140808 0.93034117 0.94435894 0.9544388  0.96148018]
 [0.89759319 0.91652627 0.93054404 0.9406239  0.94766529]]

Optimal Indices (i_l, i_u):
(array([0], dtype=int64), array([1], dtype=int64))

i_l = i[0][0]
i_u = i[1][0]
print("Chosen row of coverage matrix: %d, chosen column of coverage matrix: %d" % (i_l, i_u))

u_final = min(n, u[i_u])
u_final = max(0, int(u_final)-1)
        
l_final = min(n, l[i_l])
l_final = max(0, int(l_final)-1)

# Actual value corresponding to u_final and l_final
upper_value_threshold = n - u_final
lower_value_threshold = l_final

upper_value = sorted_data[upper_value_threshold]
lower_value = sorted_data[lower_value_threshold]

print("Lower bound value:", lower_value)
print("Upper bound value:", upper_value)

Chosen row of coverage matrix: 0, chosen column of coverage matrix: 1
Lower bound value: 5.0
Upper bound value: 6.3
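For comparison, here is a compact sketch of the textbook order-statistic formulation, in which a "success" is an observation below the percentile (so q = 0.7 directly) and both interval endpoints are read from the sorted data by rank. The function name order_statistic_ci and the stand-in data are hypothetical, and this variant need not reproduce the numbers printed above.

```python
import numpy as np
from scipy.stats import binom

def order_statistic_ci(data, q=0.7, alpha=0.05):
    """Distribution-free CI for the q-th quantile based on order statistics."""
    x = np.sort(np.asarray(data, dtype=float))
    n = len(x)
    # 1-based ranks; the number of observations below the true quantile
    # is modeled as Binomial(n, q).
    lower_rank = int(binom.ppf(alpha / 2, n, q))
    upper_rank = int(binom.ppf(1 - alpha / 2, n, q)) + 1
    # Clamp to valid ranks; for very small n this can reduce the coverage.
    lower_rank = max(lower_rank, 1)
    upper_rank = min(upper_rank, n)
    # Probability that the true quantile lies between the two order statistics.
    coverage = binom.cdf(upper_rank - 1, n, q) - binom.cdf(lower_rank - 1, n, q)
    return x[lower_rank - 1], x[upper_rank - 1], coverage

demo = np.arange(1, 151)  # stand-in data, not the Iris values
low, high, cov = order_statistic_ci(demo, q=0.7, alpha=0.05)
print("CI: (%.1f, %.1f), coverage: %.4f" % (low, high, cov))
```

Because both ranks come straight from the binomial quantile function, no search over neighboring indices is needed; the price is that the interval may be slightly wider than the one found by the coverage matrix search.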

import matplotlib.pyplot as plt

# Plotting the histogram
plt.figure(figsize=(10, 6))
plt.hist(data, bins=30, color='skyblue', edgecolor='black', alpha=0.7, label='Data')

# Adding vertical lines for lower_value and upper_value
plt.axvline(lower_value, color='red', linestyle='--', label='Lower bound')
plt.axvline(upper_value, color='green', linestyle='--', label='Upper bound')

# Adding vertical line for the 70th percentile
plt.axvline(percentile_70, color='purple', linestyle='-.', label='70th Percentile')

# Adding title and labels
plt.title('Histogram of Data with Confidence Bounds and 70th Percentile')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.legend()

plt.show()

Bootstrap method

A commonly used alternative method to calculate confidence intervals for percentiles (also known as quantiles) is the Bootstrap method.

The Bootstrap method involves resampling the dataset multiple times with replacement and then computing the desired statistic (in this case, the 70th percentile) for each of these resampled datasets. This gives a distribution of the 70th percentiles from which we can compute the confidence interval.

lower: This represents the value below which the bottom 2.5% of the recorded bootstrap 70th percentiles fall. In other words, it’s like saying, “In 2.5% of our bootstrap ‘experiments,’ the 70th percentile was below this value.”

upper: This is the value below which the bottom 97.5% of the recorded bootstrap 70th percentiles fall. Put another way, “In 97.5% of our bootstrap ‘experiments,’ the 70th percentile was below this value.”

import numpy as np

def bootstrap_percentile_CI(data, percentile=70, alpha=0.05, B=10000):
    """Calculate the bootstrap confidence interval for a given percentile."""
    n = len(data)
    resampled_percentiles = []

    for _ in range(B):
        resample = np.random.choice(data, n, replace=True)
        resampled_percentiles.append(np.percentile(resample, percentile))

    lower = np.percentile(resampled_percentiles, 100 * alpha/2)
    upper = np.percentile(resampled_percentiles, 100 * (1-alpha/2))
    
    return lower, upper

# Calculate the bootstrap 70th percentile confidence interval
lower_bootstrap, upper_bootstrap = bootstrap_percentile_CI(data)
print("Bootstrap 70th percentile CI: (%.2f, %.2f)" % (lower_bootstrap, upper_bootstrap))

Bootstrap 70th percentile CI: (6.10, 6.43)
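The loop above can also be written without an explicit Python loop. The following is a minimal sketch, assuming NumPy's default_rng generator and a fixed seed so the result is reproducible; the function name and the seed parameter are additions for illustration, not part of the original notebook.

```python
import numpy as np

def bootstrap_percentile_ci_vectorized(data, percentile=70, alpha=0.05, B=10000, seed=0):
    """Percentile-bootstrap CI computed with one (B, n) array of resamples."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    # Each row of idx holds the indices of one resample drawn with replacement.
    idx = rng.integers(0, n, size=(B, n))
    # One percentile per resample (row).
    stats = np.percentile(data[idx], percentile, axis=1)
    lower = np.percentile(stats, 100 * alpha / 2)
    upper = np.percentile(stats, 100 * (1 - alpha / 2))
    return lower, upper

demo = np.arange(100.0)  # stand-in data, not the Iris values
lower_demo, upper_demo = bootstrap_percentile_ci_vectorized(demo, B=2000, seed=1)
print("Bootstrap 70th percentile CI: (%.2f, %.2f)" % (lower_demo, upper_demo))
```

The (B, n) index array trades memory for speed, so for very large datasets the loop version may still be the better choice.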

# Plotting
plt.hist(data, bins=30, color='lightblue', edgecolor='black', alpha=0.7)
plt.axvline(x=np.percentile(data, 70), color='green', linestyle='--', label="Sample 70th Percentile")
plt.axvline(x=lower_bootstrap, color='red', linestyle='--', label="Lower Bound of CI")
plt.axvline(x=upper_bootstrap, color='blue', linestyle='--', label="Upper Bound of CI")
plt.legend()
plt.title('Histogram of Sepal Length with Bootstrap CI for 70th Percentile')
plt.xlabel('Sepal Length')
plt.ylabel('Frequency')
plt.show()

Discussion

The bootstrap method makes minimal assumptions about the distribution of the data, making it versatile for a wide variety of datasets. This flexibility allows the bootstrap to handle complex or unknown data distributions. The binomial method, by contrast, reduces each observation to a binary outcome (above or below the percentile) and relies on the count of those outcomes following a binomial distribution. While the binomial approach is computationally simpler and quicker, it might not always provide an accurate representation, especially if the underlying assumptions (such as independent observations) aren’t met. The bootstrap can be more computationally intensive due to resampling, but it is more adaptable and often provides a more accurate estimate when those assumptions do not hold.

Summarizing articles on PMDD treatments using TextRank

In this blog post, I want to share with you what I learned about treating PMDD by summarizing articles with TextRank. TextRank is not really a summarization algorithm; it is used for extracting top sentences, but I decided to use it anyway and see the results. I started by using the googlesearch library in Python to search for “PMDD treatments – calcium, hormones, SSRIs, scientific evidence”. The search resulted in a list of URLs to various articles on PMDD treatments. However, not all of them were useful for my purposes, as some were blocked due to access restrictions. I used BeautifulSoup to extract the text from the remaining articles.

In order to exclude irrelevant paragraphs, I used the library called Justext. This library is designed for removing boilerplate content and other non-relevant text from HTML pages. Justext uses heuristics to determine which parts of the page are boilerplate and which are not, and then filters out the former. Justext tries to identify these sections by analyzing the length of the text, the density of links, and the presence of certain HTML tags.

Some examples of the kinds of content that Justext can remove include navigation menus, copyright statements, disclaimers, and other non-content-related text. It does not work perfectly, as I still ended up with sentences such as the following in the resulting articles: “This content is owned by the AAFP. A person viewing it online may make one printout of the material and may use that printout only for his or her personal, non-commercial reference.”

Next, I used existing code that implements the TextRank algorithm that I found online. I slightly improved it so that, instead of a bag-of-words method, the algorithm uses sentence embeddings. Let’s go step by step through the algorithm. I defined a class called TextRank4Sentences. Here is a description of each line in the __init__ method of this class:

self.damping = 0.85: This sets the damping coefficient used in the TextRank algorithm to 0.85. Here it is the probability that the algorithm moves from one sentence to another by following a similarity link.

self.min_diff = 1e-5: This sets the convergence threshold. The algorithm will stop iterating when the difference between the summed PageRank scores of two consecutive iterations is less than this value.

self.steps = 100: This sets the maximum number of iterations to run the algorithm before stopping.

self.text_str = None: This initializes a variable to store the input text.

self.sentences = None: This initializes a variable to store the individual sentences of the input text.

self.pr_vector = None: This initializes a variable to store the TextRank scores for each sentence in the input text.

import re

import numpy as np
from nltk import sent_tokenize, word_tokenize
from nltk.cluster.util import cosine_distance
from sklearn.metrics.pairwise import cosine_similarity

from sentence_transformers import SentenceTransformer

# Pre-trained sentence-embedding model, loaded once and shared by the class below
model = SentenceTransformer('distilbert-base-nli-stsb-mean-tokens')

# Matches any run of whitespace characters
MULTIPLE_WHITESPACE_PATTERN = re.compile(r"\s+", re.UNICODE)

class TextRank4Sentences():
    def __init__(self):
        self.damping = 0.85  # damping coefficient, usually is .85
        self.min_diff = 1e-5  # convergence threshold
        self.steps = 100  # iteration steps
        self.text_str = None
        self.sentences = None
        self.pr_vector = None

The next step is defining a private method _sentence_similarity() which takes in two sentences and returns their cosine similarity using a pre-trained model. The method encodes each sentence into a vector using the pre-trained model and then calculates the cosine similarity between the two vectors using another function core_cosine_similarity().

core_cosine_similarity() is a separate function that measures the cosine similarity between two vectors. It takes in two vectors as inputs and returns a similarity score between -1 and 1. The function uses the cosine_similarity() function from the sklearn library, which returns the score as a matrix with one entry per pair of input rows. The cosine similarity is a measure of the similarity between two non-zero vectors of an inner product space. It is calculated as the cosine of the angle between the two vectors.

Mathematically, given two vectors u and v, the cosine similarity is defined as:

cosine_similarity(u, v) = (u . v) / (||u|| ||v||)

where u . v is the dot product of u and v, and ||u|| and ||v|| are the magnitudes of u and v respectively.
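As a quick numeric check of the formula, here is a minimal NumPy sketch. The helper name cosine and the example vectors are made up for illustration; the three cases show the extremes of the range: same direction, orthogonal, and opposite direction.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity of two 1-D vectors: (u . v) / (||u|| ||v||)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

same_direction = cosine([1, 2, 3], [2, 4, 6])   # parallel vectors, close to 1
orthogonal = cosine([1, 0], [0, 1])             # perpendicular vectors, 0
opposite = cosine([1, 0], [-1, 0])              # opposite vectors, -1

print(same_direction, orthogonal, opposite)
```

Sentence embeddings rarely produce strongly negative values in practice, but nothing in the formula rules them out.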

def core_cosine_similarity(vector1, vector2):
    """
    measure cosine similarity between two vectors
    :param vector1:
    :param vector2:
    :return: -1 <= cosine similarity value <= 1
    """
    sim_score = cosine_similarity(vector1, vector2)
    return sim_score

class TextRank4Sentences():
    def __init__(self):
        ...

    def _sentence_similarity(self, sent1, sent2):
        first_sent_embedding = model.encode([sent1])
        second_sent_embedding = model.encode([sent2])
        
        return core_cosine_similarity(first_sent_embedding, second_sent_embedding)

In the next function, the similarity matrix is built for the given sentences. The function _build_similarity_matrix takes a list of sentences as input and creates an empty similarity matrix sm with dimensions len(sentences) x len(sentences). Then, for each sentence in the list, the function computes its similarity with all other sentences in the list using the _sentence_similarity function. After calculating the similarity scores for all sentence pairs, the function get_symmetric_matrix is used to make the similarity matrix symmetric.

The function get_symmetric_matrix adds the transpose of the matrix to itself, and then subtracts the diagonal elements of the original matrix. In other words, for each element (i, j) of the input matrix, the corresponding element (j, i) is added to it to make it symmetric. However, the diagonal elements (i, i) of the original matrix are not added twice, so they need to be subtracted once from the sum of the corresponding elements in the upper and lower triangles. The resulting matrix has the same values in the upper and lower triangles, and is symmetric along its main diagonal. The similarity matrix is made symmetric in order to ensure that the similarity score between two sentences in the matrix is the same regardless of their order, and it also simplifies the computation.
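A tiny worked example may help here. The sketch below applies the same expression to a made-up upper-triangular matrix, where only the entries on and above the diagonal are filled in; the result mirrors them into the lower triangle and keeps the diagonal unchanged.

```python
import numpy as np

# Made-up upper-triangular input: entries below the diagonal are zero.
upper = np.array([[1.0, 2.0, 3.0],
                  [0.0, 4.0, 5.0],
                  [0.0, 0.0, 6.0]])

# Same expression as get_symmetric_matrix: add the transpose,
# then subtract the diagonal that was counted twice.
symmetric = upper + upper.T - np.diag(upper.diagonal())

print(symmetric)
```

One thing to keep in mind: if the input matrix is already symmetric, this expression doubles every off-diagonal entry while leaving the diagonal as it is.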

def get_symmetric_matrix(matrix):
    """
    Get Symmetric matrix
    :param matrix:
    :return: matrix
    """
    return matrix + matrix.T - np.diag(matrix.diagonal())

class TextRank4Sentences():
    def __init__(self):
        ...

    def _sentence_similarity(self, sent1, sent2):
        ...
    
    def _build_similarity_matrix(self, sentences, stopwords=None):
        # create an empty similarity matrix
        sm = np.zeros([len(sentences), len(sentences)])
    
        for idx, sentence in enumerate(sentences):
            print("Current location: %d" % idx)
            sm[idx] = self._sentence_similarity(sentence, sentences)
    
        # Get symmetric matrix
        sm = get_symmetric_matrix(sm)
    
        # Normalize matrix by column
        norm = np.sum(sm, axis=0)
        # Divide only where norm is non-zero; out= keeps the skipped entries at 0
        sm_norm = np.divide(sm, norm, out=np.zeros_like(sm), where=norm != 0)
    
        return sm_norm
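The same matrix can be built in one pass when all sentence embeddings are available as a single array. The sketch below is an alternative, not the code used in this post: it assumes an (n, d) array with one non-zero embedding per row (for example, what encoding the whole sentence list at once would give), uses made-up toy vectors, and skips the symmetrization step because a cosine-similarity matrix is already symmetric.

```python
import numpy as np

def build_similarity_matrix(embeddings):
    """Column-normalized cosine-similarity matrix from an (n, d) array."""
    emb = np.asarray(embeddings, dtype=float)
    # Scale every row to unit length so a dot product equals the cosine.
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sm = unit @ unit.T
    # Divide each column by its sum; columns that sum to 0 stay all zeros.
    norm = sm.sum(axis=0)
    return np.divide(sm, norm, out=np.zeros_like(sm), where=norm != 0)

toy_embeddings = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])  # made-up vectors
sm_norm = build_similarity_matrix(toy_embeddings)
print(sm_norm)
```

Encoding every sentence once, instead of once per pair, is also much cheaper when the document has many sentences.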

In the next function, the ranking algorithm PageRank is implemented to calculate the importance of each sentence in the document. The similarity matrix created in the previous step is used as the basis for the PageRank algorithm. The function takes the similarity matrix as input and initializes the pagerank vector with a value of 1 for each sentence.

In each iteration, the pagerank vector is updated based on the similarity matrix and damping coefficient. The damping coefficient represents the probability of following a similarity link from the current sentence, while the remaining 1 - damping is the probability of jumping to a sentence at random. The algorithm continues to iterate until either the maximum number of steps is reached or the difference between the current and previous pagerank vector is less than a threshold value. Finally, the function returns the pagerank vector, which represents the importance score for each sentence.

class TextRank4Sentences():
    def __init__(self):
        ...

    def _sentence_similarity(self, sent1, sent2):
        ...
    
    def _build_similarity_matrix(self, sentences, stopwords=None):
        ...

    def _run_page_rank(self, similarity_matrix):

        pr_vector = np.array([1] * len(similarity_matrix))

        # Iteration
        previous_pr = 0
        for epoch in range(self.steps):
            pr_vector = (1 - self.damping) + self.damping * np.matmul(similarity_matrix, pr_vector)
            if abs(previous_pr - sum(pr_vector)) < self.min_diff:
                break
            else:
                previous_pr = sum(pr_vector)

        return pr_vector
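To see the iteration at work, here is a standalone sketch on a made-up three-sentence similarity matrix. It follows the same update rule, but as an assumption of this sketch it checks convergence on the element-wise change of the vector rather than on the change of its sum. Sentence 0 is the most similar to the other two, so it should end up with the highest score.

```python
import numpy as np

def page_rank(matrix, damping=0.85, min_diff=1e-5, steps=100):
    """Power iteration: pr = (1 - d) + d * M @ pr, starting from all ones."""
    pr = np.ones(len(matrix))
    for _ in range(steps):
        new_pr = (1 - damping) + damping * matrix @ pr
        converged = np.abs(new_pr - pr).sum() < min_diff
        pr = new_pr
        if converged:
            break
    return pr

# Made-up raw similarities (zero diagonal), then normalized by column.
sims = np.array([[0.0, 2.0, 2.0],
                 [2.0, 0.0, 1.0],
                 [2.0, 1.0, 0.0]])
toy_matrix = sims / sims.sum(axis=0)

scores = page_rank(toy_matrix)
print(scores, "top sentence:", int(np.argmax(scores)))
```

Because every column of the normalized matrix sums to 1, the scores always add up to the number of sentences, which is a handy sanity check.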

The _get_sentence function takes an index as input and returns the corresponding sentence from the list of sentences. If the index is out of range, it returns an empty string. This function is used later in the class to get the highest ranked sentences.

class TextRank4Sentences():
    def __init__(self):
        ...

    def _sentence_similarity(self, sent1, sent2):
        ...
    
    def _build_similarity_matrix(self, sentences, stopwords=None):
        ...

    def _run_page_rank(self, similarity_matrix):
        ...

    def _get_sentence(self, index):

        try:
            return self.sentences[index]
        except IndexError:
            return ""

The code then defines a method called get_top_sentences which returns a summary of the most important sentences in a document. The method takes two optional arguments: number (default=5) specifies the maximum number of sentences to include in the summary, and similarity_threshold (default=0.5) specifies the minimum similarity score between two sentences that should be considered “too similar” to include in the summary.

The method first initializes an empty list called top_sentences to hold the selected sentences. It then checks if a pr_vector attribute has been computed for the document. If the pr_vector exists, it sorts the indices of the sentences in descending order based on their PageRank scores and saves them in the sorted_pr variable.

It then iterates through the sentences in sorted_pr, starting from the one with the highest PageRank score. For each sentence, it removes any extra whitespace, replaces newlines with spaces, and checks if it is too similar to any of the sentences already selected for the summary. If it is not too similar, it adds the sentence to top_sentences. Once the selected sentences are finalized, the method concatenates them into a single string separated by spaces, and returns the summary.

class TextRank4Sentences():
    def __init__(self):
        ...

    def _sentence_similarity(self, sent1, sent2):
        ...
    
    def _build_similarity_matrix(self, sentences, stopwords=None):
        ...

    def _run_page_rank(self, similarity_matrix):
        ...

    def _get_sentence(self, index):
        ...
   
    def get_top_sentences(self, number=5, similarity_threshold=0.5):
        top_sentences = []
    
        if self.pr_vector is not None:
            sorted_pr = np.argsort(self.pr_vector)
            sorted_pr = list(sorted_pr)
            sorted_pr.reverse()
    
            index = 0
            while len(top_sentences) < number and index < len(sorted_pr):
                sent = self.sentences[sorted_pr[index]]
                sent = normalize_whitespace(sent)
                sent = sent.replace('\n', ' ')
    
                # Check if the sentence is too similar to any of the sentences already in top_sentences
                is_similar = False
                for s in top_sentences:
                    sim = self._sentence_similarity(sent, s)
                    if sim > similarity_threshold:
                        is_similar = True
                        break
    
                if not is_similar:
                    top_sentences.append(sent)
    
                index += 1
        
        summary = ' '.join(top_sentences)
        return summary
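get_top_sentences calls a helper named normalize_whitespace that is not shown in this post. The following is only a guess at what it does, assuming it collapses every run of whitespace into a single space using the MULTIPLE_WHITESPACE_PATTERN defined earlier; the real helper may treat newlines differently.

```python
import re

MULTIPLE_WHITESPACE_PATTERN = re.compile(r"\s+", re.UNICODE)

def normalize_whitespace(text):
    """Assumed behavior: replace each whitespace run with one space and trim."""
    return MULTIPLE_WHITESPACE_PATTERN.sub(" ", text).strip()

cleaned = normalize_whitespace("SSRIs  are\n\tfirst-line   therapy. ")
print(repr(cleaned))
```

With this version the later sent.replace('\n', ' ') call becomes a no-op, since no newlines survive the normalization.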

The _remove_duplicates method takes a list of sentences as input and returns a list of unique sentences, by removing any duplicates in the input list.

class TextRank4Sentences():
    def __init__(self):
        ...

    def _sentence_similarity(self, sent1, sent2):
        ...
    
    def _build_similarity_matrix(self, sentences, stopwords=None):
        ...

    def _run_page_rank(self, similarity_matrix):
        ...

    def _get_sentence(self, index):
        ...
   
    def get_top_sentences(self, number=5, similarity_threshold=0.5):
        ...
    
    def _remove_duplicates(self, sentences):
        seen = set()
        unique_sentences = []
        for sentence in sentences:
            if sentence not in seen:
                seen.add(sentence)
                unique_sentences.append(sentence)
        return unique_sentences
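The same order-preserving de-duplication can be written in one line. This sketch relies on the fact that Python dictionaries keep insertion order (Python 3.7 and later); the example sentences are made up.

```python
sentences = [
    "SSRIs are a first-line therapy.",
    "Calcium may help.",
    "SSRIs are a first-line therapy.",
    "Education reduces conflict.",
    "Calcium may help.",
]

# dict.fromkeys keeps the first occurrence of each key, in original order.
unique_sentences = list(dict.fromkeys(sentences))
print(unique_sentences)
```

Both versions only remove exact duplicates; sentences that differ by a single character are kept.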

The analyze method takes the input text and an optional list of stop words stop_words. In this post it receives a list of article texts, so set() removes duplicate articles (without preserving their order), and the remaining articles are joined into a single string self.full_text.

It then uses the sent_tokenize() method from the nltk library to tokenize the text into sentences and removes duplicate sentences using the _remove_duplicates() method. It also removes sentences that have a word count less than or equal to the tenth percentile of all sentence lengths.

After that, the method calculates a similarity matrix using the _build_similarity_matrix() method, passing in the preprocessed list of sentences and the stop_words list.

Finally, it runs the PageRank algorithm on the similarity matrix using the _run_page_rank() method to obtain a ranking of the sentences based on their importance in the text. This ranking is stored in self.pr_vector.

class TextRank4Sentences():
    ...

    def analyze(self, text, stop_words=None):
        self.text_unique = list(set(text))
        self.full_text = ' '.join(self.text_unique)
        #self.full_text = self.full_text.replace('\n', ' ')
        
        self.sentences = sent_tokenize(self.full_text)
        
        # for i in range(len(self.sentences)):
        #     self.sentences[i] = re.sub(r'[^\w\s$]', '', self.sentences[i])
    
        self.sentences = self._remove_duplicates(self.sentences)
        
        sent_lengths = [len(sent.split()) for sent in self.sentences]
        length_cutoff = np.percentile(sent_lengths, 10)  # 10th percentile of sentence lengths
        self.sentences = [sentence for sentence in self.sentences if len(sentence.split()) > length_cutoff]

        print("Min length: %d, Total number of sentences: %d" % (length_cutoff, len(self.sentences)) )

        similarity_matrix = self._build_similarity_matrix(self.sentences, stop_words)

        self.pr_vector = self._run_page_rank(similarity_matrix)
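The length filter is easy to try in isolation. The sketch below uses synthetic sentences with chosen word counts (an assumption for illustration) and drops everything at or below the 10th percentile of sentence length, which is the cutoff passed to np.percentile in the code above.

```python
import numpy as np

# Synthetic sentences built from a list of word counts.
word_counts = [1, 5, 4, 6, 7, 5, 4, 6, 5, 8]
sentences = [" ".join(["word"] * k) for k in word_counts]

lengths = [len(s.split()) for s in sentences]
cutoff = np.percentile(lengths, 10)

# Keep only sentences strictly longer than the cutoff.
kept = [s for s in sentences if len(s.split()) > cutoff]
print("cutoff:", cutoff, "kept:", len(kept), "of", len(sentences))
```

Here only the one-word sentence falls below the cutoff, which is the kind of fragment (stray headings, button labels) the filter is meant to remove.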

In order to find articles, I used the googlesearch library. The code below performs a Google search using the search() function provided by the library. It searches for the query “PMDD treatments – calcium, hormones, SSRIs, scientific evidence” and retrieves the top 7 search results.

# summarize articles
import requests
from bs4 import BeautifulSoup
from googlesearch import search
import justext
query = "PMDD treatments - calcium, hormones, SSRIs, scientific evidence"

# perform the google search and retrieve the top 7 search results
top_results = []
for url in search(query, num_results=7):
    top_results.append(url)

In the next part, the code extracts the article text for each of the top search results collected in the previous step. For each URL in the top_results list, the code sends an HTTP GET request to the URL using the requests library. It then uses the justext library to extract the main content of the webpage by removing any boilerplate text (i.e., non-content text).

article_texts = []

# extract the article text for each of the top search results
for url in top_results:
    response = requests.get(url)
    paragraphs = justext.justext(response.content, justext.get_stoplist("English"))
    text = ''
    for paragraph in paragraphs:
        if not paragraph.is_boilerplate:
            text += paragraph.text + '\n'

    if "Your access to PubMed Central has been blocked" not in text:
        article_texts.append(text.strip())
        print(text)
    print('-' * 50)
    
print("Total articles collected: %d" % len(article_texts))

In the final step, the extracted article texts are passed to an instance of the TextRank4Sentences class, which is used to perform text summarization. The output of get_top_sentences() is a single string containing the top-ranked sentences in the input text, joined by spaces. These are considered to be the most important and representative sentences for summarizing the content of the text. The string is stored in the variable summary_text.

# summarize
tr4sh = TextRank4Sentences()
tr4sh.analyze(article_texts)
summary_text = tr4sh.get_top_sentences(15)

Results:
(I did not list irrelevant sentences that appeared in the final results, such as “You will then receive an email that contains a secure link for resetting your password…”)

Total articles collected: 6

There have been at least 15 randomized controlled trials of the use of selective serotonin-reuptake inhibitors (SSRIs) for the treatment of severe premenstrual syndrome (PMS), also called premenstrual dysphoric disorder (PMDD).

It is possible that the irritability/anger/mood swings subtype of PMDD is differentially responsive to treatments that lead to a quick change in ALLO availability or function, for example, symptom-onset SSRI or dutasteride.
* My note: ALLO is allopregnanolone
* My note: Dutasteride is a synthetic 4-azasteroid compound that is a selective inhibitor of both the type 1 and type 2 isoforms of steroid 5 alpha-reductase

From 2 to 10 percent of women of reproductive age have severe distress and dysfunction caused by premenstrual dysphoric disorder, a severe form of premenstrual syndrome.

The rapid efficacy of selective serotonin reuptake inhibitors (SSRIs) in PMDD may be due in part to their ability to increase ALLO levels in the brain and enhance GABAA receptor function with a resulting decrease in anxiety.

Clomipramine, a serotoninergic tricyclic antidepressant that affects the noradrenergic system, in a dosage of 25 to 75 mg per day used during the full cycle or intermittently during the luteal phase, significantly reduced the total symptom complex of PMDD.

Relapse was more likely if a woman stopped sertraline after only 4 months versus 1 year, if she had more severe symptoms prior to treatment and if she had not achieved full symptom remission with sertraline prior to discontinuation.

Women with negative views of themselves and the future caused or exacerbated by PMDD may benefit from cognitive-behavioral therapy. This kind of therapy can enhance self-esteem and interpersonal effectiveness, as well as reduce other symptoms.

Educating patients and their families about the disorder can promote understanding of it and reduce conflict, stress, and symptoms.

Anovulation can also be achieved with the administration of estrogen (transdermal patch, gel, or implant).

In a recent meta-analysis of 15 randomized, placebo-controlled studies of the efficacy of SSRIs in PMDD, it was concluded that SSRIs are an effective and safe first-line therapy and that there is no significant difference in symptom reduction between continuous and intermittent dosing.

Preliminary confirmation of alleviation of PMDD with suppression of ovulation with a GnRH agonist should be obtained prior to hysterectomy.

Sexual side effects, such as reduced libido and inability to reach orgasm, can be troubling and persistent, however, even when dosing is intermittent. * My note: I think this sentence refers to the side-effects of SSRIs

Calculating Confidence Interval for a Percentile

Calculating the confidence interval for a percentile is a crucial step in understanding the variability and the uncertainty around the estimated value. In many real-world applications, the distribution of the data is unknown and this makes it difficult to determine the confidence intervals. In such scenarios, using a binomial distribution can be a viable alternative to estimate the confidence intervals for a percentile.

For instance, let’s consider a variable with 300 data points and we want to calculate the 70th and 90th percentiles and the corresponding confidence intervals for the variable. To do this, we can use a binomial distribution approach.

First, we need to choose an alpha level, which is a probability that determines the size of the confidence interval. A common choice for alpha is 0.05, which corresponds to a 95% confidence interval.

Next, we use the cumulative distribution function (CDF) of the binomial distribution to estimate the lower and upper bounds of the confidence interval. The CDF of the binomial distribution gives the probability of getting k or fewer successes in n independent Bernoulli trials, where the probability of success in each trial is p.
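The relationship between the CDF and its inverse (the quantile function, called ppf in SciPy) can be checked directly. This is a small sketch using the same n and p as the example in this post; the probability level 0.975 is the upper level for alpha = 0.05.

```python
from scipy.stats import binom

n, p = 300, 0.7
level = 0.975  # 1 - alpha / 2 for alpha = 0.05

# ppf returns the smallest k whose cumulative probability reaches the level.
k = int(binom.ppf(level, n, p))

cdf_at_k = binom.cdf(k, n, p)         # P(X <= k), at least the level
cdf_below_k = binom.cdf(k - 1, n, p)  # P(X <= k - 1), still below the level

print(k, cdf_at_k, cdf_below_k)
```

In other words, k is the first count at which the cumulative probability crosses the requested level, which is why the bounds are always whole numbers.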

To calculate the 70th percentile and its confidence interval, we use the following steps:

Set n = 300, which is the number of data points.
Set p = 0.7, which corresponds to the 70th percentile.
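Calculate the binomial quantiles using the inverse of the CDF: for a probability level c, the quantile is the smallest k such that P(X <= k) >= c, where X is a binomial random variable with parameters n and p. Here c is alpha / 2 for the lower bound and 1 - alpha / 2 for the upper bound.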
Use the CDF to determine the lower and upper bounds of the confidence interval.

Below is the Python code for calculating the confidence interval for the 70th percentile.

alpha – the significance level used for the confidence interval. The confidence level is 1 – alpha: the long-run proportion of intervals constructed this way that contain the true value of the parameter being estimated. Alpha is typically set to 0.05 or 0.01, which gives a 95% or 99% confidence interval, respectively. In the code, alpha=0.05 is used, but it can be changed to a different value if desired.

n – number of observations

q – percentile value

from scipy.stats import binom
import numpy as np

alpha = 0.05
n = 300
q = 0.7

Below is the code for calculating the upper and lower bounds for the confidence interval. The u value is calculated as the ceiling of the binomial distribution’s quantile function (ppf) evaluated at 1 – alpha / 2 (1 – 0.05 / 2 = 0.975), and the value is shifted by adding an array of numbers from -2 to 2. Any values of u that are greater than n are set to infinity.

u = np.ceil(binom.ppf(1 - alpha / 2, n, q)) + np.arange(-2, 3)
u[u > n] = np.inf

l = np.ceil(binom.ppf(alpha / 2, n, q)) + np.arange(-2, 3)
l[l < 0] = -np.inf

# From the calculation of bounds, np.ceil(binom.ppf(1 - alpha / 2, n, q)) and np.ceil(binom.ppf(alpha / 2, n, q)), we obtain that
# the upper bound value is 225 and the lower bound value is 194. This means that given a sample of size 300, a binomial distribution, and
# probability of success p=0.7, we are 95% certain that the number of successes will be between 194 and 225.

Next we calculate coverage of the percentiles that the bounds cover. The coverage represents a matrix of values that correspond to the probability of coverage of the confidence interval for each combination of lower and upper bounds of the interval.

The coverage calculation uses the binom.cdf function to calculate the cumulative distribution function (CDF) for the binomial distribution, which is then used to determine the coverage probability of each combination of u and l. Once the coverage matrix is calculated, the code finds the index i corresponding to the combination of u and l that gives the closest coverage probability to 1-alpha.

coverage = np.zeros((len(l), len(u)))

for i, a in enumerate(l):
    for j, b in enumerate(u):
        coverage[i, j] = binom.cdf(b - 1, n, q) - binom.cdf(a - 1, n, q)

Next we select the upper and lower bounds of the confidence interval based on the coverage of the interval. The code first checks if the maximum coverage is less than 1 minus the significance level alpha. If it is, the code selects the pair of bounds with the maximum coverage probability. Otherwise, the code selects the pair of bounds with the smallest coverage probability that is still greater than or equal to 1 minus alpha.

if np.max(coverage) < 1 - alpha:
    i = np.where(coverage == np.max(coverage))
else:
    i = np.where(coverage == np.min(coverage[coverage >= 1 - alpha]))

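# np.where returns (row indices, column indices). Rows of the coverage
# matrix correspond to l and columns to u.
i_l = i[0][0]
i_u = i[1][0]

# Clamp to [0, n] before converting to int, since a bound may be +/-inf.
u_final = min(n, u[i_u])
u_final = max(0, int(u_final) - 1)

l_final = max(0, min(n, l[i_l]))
l_final = max(0, int(l_final) - 1)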

The resulting l and u are 192 and 223, respectively. Therefore, if you have a sample of 300 and want the 95% confidence interval for the 70th percentile of a variable X, sort the values in ascending order and take the values of X at the 192nd and 223rd positions as the lower and upper limits.
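As a rough illustration of that last step, the sketch below applies the two ranks to a made-up sample. The sample itself, the variable names, and the reading of 192 and 223 as 1-based ranks are all my assumptions, not part of the original notebook.

```python
import random

random.seed(0)
n = 300
# Hypothetical sample; replace with your own 300 observations.
x = [random.gauss(100, 15) for _ in range(n)]

l_rank, u_rank = 192, 223   # ranks reported in the text, read as 1-based
x_sorted = sorted(x)

ci_lower = x_sorted[l_rank - 1]   # 192nd smallest value
ci_upper = x_sorted[u_rank - 1]   # 223rd smallest value

# Simple point estimate of the 70th percentile: the (0.7 * n)-th smallest value.
p70 = x_sorted[210 - 1]

print(f"70th percentile ~ {p70:.2f}, 95% CI ~ [{ci_lower:.2f}, {ci_upper:.2f}]")
```

Note that the interval is expressed in terms of observed values, so no assumption about the shape of the distribution of X is needed.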

Advent of Code Day 5 – my bonus question

I am doing the Advent of Code. So far I have solved all the questions for the four previous days and part one of the question for day five. I have also created my own question for fun, which is below:

After many hours of walking, the Elves come to a forest glade. They are quite tired and hungry, and one of the Elves suddenly notices that the glade is full of mushrooms. The Elves are familiar with this mushroom species – they are edible and quite tasty. The Elves pick all of the mushrooms and are almost ready to make mushroom soup, when they remember one tricky problem – there is a poisonous mushroom species that looks very similar, and often a poisonous mushroom will grow right among the edible mushrooms.

At this point the Elves have determined the molecular structure of each mushroom that they picked. The structure always consists of five segments and each segment consists of a number and a letter.

Example: 0.9H 0.08G 0.27L 0.57M 0.84P

Each letter molecule (A – Z) has a corresponding weight, from 0 to 25. The numbers also represent additional weight units. It is therefore possible to calculate the molecular weight of each mushroom. In the above example the weight would be 0.9 + 7 + 0.08 + 6 + 0.27 + 11 + 0.57 + 12 + 0.84 + 15 = 53.66

If the structure had a negative number, for example 0.9H -0.08G 0.27L 0.57M 0.84P, then the whole negative segment (number and letter) would need to be subtracted. The weight then would be 0.9 + 7 – 0.08 – 6 + 0.27 + 11 + 0.57 + 12 + 0.84 + 15 = 41.5

The Elves are aware that the value of each segment of a mushroom comes from a process generated by ~N(12.5, 4.5) and there is no correlation between the segments. (The value of the segment is number + letter, for example 0.9H is 7.9, while -0.08G is -6.08).
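The weight rules can be sketched in a few lines of Python. This only parses a structure and computes its weight; it does not solve the puzzle. The function names and the regular expression are my own, and I am assuming every segment in the input looks like the examples (an optional minus sign, a decimal number, one capital letter).

```python
import re

# Optional minus sign, a number, then a single capital letter.
SEGMENT = re.compile(r"(-?)(\d+(?:\.\d+)?)([A-Z])")

def segment_values(structure):
    """Signed value of each segment, e.g. '0.9H' -> 7.9, '-0.08G' -> -6.08."""
    values = []
    for sign, number, letter in SEGMENT.findall(structure):
        magnitude = float(number) + (ord(letter) - ord("A"))  # A=0 ... Z=25
        values.append(-magnitude if sign == "-" else magnitude)
    return values

def molecular_weight(structure):
    """Total weight of a mushroom: the sum of its signed segment values."""
    return sum(segment_values(structure))

print(round(molecular_weight("0.9H 0.08G 0.27L 0.57M 0.84P"), 2))   # 53.66
print(round(molecular_weight("0.9H -0.08G 0.27L 0.57M 0.84P"), 2))  # 41.5
```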

The mushroom that is poisonous is definitely tricky to find for the Elves because it looks exactly the same as the edible mushrooms. BUT! The molecular structure of this mushroom gives it away! It is very unlikely that such structure would be generated by the same process as for the edible mushrooms. Find the poisonous mushroom from the input list so that the Elves can start cooking their soup.

The list of mushrooms is in the link below:

Advent of Code Day 5 bonus question input

Reddit Depression Regimens cont’d

Previous posts on the topic of scraping reddit data from the depressionregimens subreddit:

Reddit Depression Regimens – Topic Modeling

Reddit Depression Regimens – Topic Modeling cont’d

Next we will create some plots with javascript. For example, it would be interesting to see how often specific psychotropic medications and supplements are mentioned in the text data.
Below is a chart with frequencies of the most common antidepressant medications. The counts were performed by combining the frequencies of the brand name and the chemical name (for example Wellbutrin count is wellbutrin (54) + bupropion (27) = 81).

The data was generated using python and exported as a .csv file, with columns ‘term’ and ‘freq’.
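A minimal sketch of how such a file could be produced is shown below. The medication mapping, the sample documents and the function names are hypothetical; only the column names 'term' and 'freq' and the idea of pooling brand and chemical names come from the description above.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical brand -> chemical name mapping; the real list is longer.
MEDS = {"wellbutrin": "bupropion", "zoloft": "sertraline", "prozac": "fluoxetine"}

def count_meds(documents, meds=MEDS):
    """Count mentions per medication, pooling brand and chemical names."""
    tokens = Counter()
    for doc in documents:
        tokens.update(re.findall(r"[a-z]+", doc.lower()))
    return {brand: tokens[brand] + tokens[generic] for brand, generic in meds.items()}

def write_counts(counts, fileobj):
    """Write non-zero counts as csv rows with the columns 'term' and 'freq'."""
    writer = csv.writer(fileobj)
    writer.writerow(["term", "freq"])
    for term, freq in counts.items():
        if freq > 0:
            writer.writerow([term, freq])

docs = ["Wellbutrin helped me", "I switched from bupropion to Zoloft"]
counts = count_meds(docs)

buf = io.StringIO()   # in practice: open("med_list_counts_df.csv", "w", newline="")
write_counts(counts, buf)
print(buf.getvalue())
```

Terms with a count of zero are skipped, so medications that are never mentioned do not appear in the chart.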

HTML part:

<html>
<head>
  <script src="https://cdn.plot.ly/plotly-2.0.0.min.js"></script>
  <script src="https://d3js.org/d3.v5.min.js"></script>
  <script src="https://cdn.jsdelivr.net/npm/chart.js@2.9.3"></script>
  <script src="script1.js"></script>
</head>
<body onload="draw()">
chart 1
<div id="jsdiv" style="border:solid 1px red"></div>
chart 2
<canvas id="chart"></canvas>
</body>
</html>

JS part:

function makeChart(meds) {
  // meds is an array of row objects parsed from the .csv file;
  // each row has the string fields 'term' and 'freq'.

  var hist_labels = meds.map(function(d) {
    return d.term;
  });
  var hist_counts = meds.map(function(d) {
    return +d.freq;
  });

  // Pair each label with its count and sort by descending frequency.
  var arrayOfObj = hist_labels.map(function(d, i) {
    return {
      label: d,
      data: hist_counts[i] || 0
    };
  });
  var sortedArrayOfObj = arrayOfObj.sort(function(a, b) {
    return b.data - a.data;
  });

  var newArrayLabel = [];
  var newArrayData = [];
  sortedArrayOfObj.forEach(function(d) {
    newArrayLabel.push(d.label);
    newArrayData.push(d.data);
  });

  var chart = new Chart('chart', {
    type: "horizontalBar",
    data: {
      labels: newArrayLabel,
      datasets: [
        {
          data: newArrayData,
          backgroundColor: "#33AEEF"
        }]
    },
    // All chart options live in one object; a repeated `options` key
    // would silently replace the earlier one.
    options: {
      maintainAspectRatio: false,
      scales: {
        yAxes: [{
          scaleLabel: {
            display: true,
            labelString: 'med name'
          }
        }],
        xAxes: [{
          scaleLabel: {
            display: true,
            labelString: 'freq'
          }
        }]
      },
      legend: {
        display: false
      },
      title: {
        display: true,
        text: 'Frequencies of common antidepressants'
      }
    }
  });
}

// Request data using D3
d3
  .csv("med_list_counts_df.csv")
  .then(makeChart);

We can generate charts with other medication/supplement lists using the same code. Below is a plot with frequencies of common antipsychotics. As you can see, antipsychotics are not mentioned as frequently as antidepressants, and many names in the input list were not mentioned at all (such as haldol or thorazine), so they do not show up in the chart.

Other medications and common supplements mentioned:

Reddit Depression Regimens – Topic Modeling cont’d

In the previous posts we applied LDA topic modeling to text documents from data collected from the subreddit depressionregimens. Here I will continue with the results from the derived topics model – obtaining the most representative text for each topic. As was stated, the chosen model has ten topics, and LDA assumes that each document is composed of multiple topics, with each topic being assigned a probability. Each topic is composed of multiple words, with each word assigned a probability.

Previous post: Reddit Depression Regimens – Topic Modeling

Since each document is composed of multiple topics, for each topic we can find the document with the highest probability for that topic. That document is the most representative one for the topic.
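The selection can be sketched as follows, assuming the per-document topic probabilities have already been extracted from the LDA model into a plain list of lists. The function name and the toy numbers are made up for illustration.

```python
def most_representative(doc_topics):
    """For each topic, return (document index, probability) of the document
    with the highest probability for that topic.

    doc_topics: list of per-document lists, one probability per topic.
    """
    n_topics = len(doc_topics[0])
    best = []
    for t in range(n_topics):
        doc_idx = max(range(len(doc_topics)), key=lambda d: doc_topics[d][t])
        best.append((doc_idx, doc_topics[doc_idx][t]))
    return best

# Toy example: 4 documents, 3 topics (each row sums to 1).
doc_topics = [
    [0.70, 0.20, 0.10],
    [0.10, 0.45, 0.45],
    [0.25, 0.15, 0.60],
    [0.30, 0.50, 0.20],
]
for topic, (doc_idx, prob) in enumerate(most_representative(doc_topics), start=1):
    print(f"Topic {topic}: document {doc_idx} (probability {prob:.2f})")
```

A document can be the most representative one for a topic even when that topic's probability is well below 1, as with the 0.45 to 0.64 values reported for the topics below.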

Topic 1

feel 0.040, year 0.026, thing 0.022, symptom 0.020, brain 0.019, start 0.018, time 0.017, make 0.015, issue 0.015, lot 0.014

Most representative post id with topic 1 probability of 0.45:
Full text here: https://www.reddit.com/r/depressionregimens/comments/gib17h

“Blank Mind Syndrome” – Sub group of specific symptoms including: – Loss of Internal Monologue, lack of coherent automatic thoughts, no track of time passage, lack of self insight – Depersonalisation/Derealization Feeling detached, having no “sense of self”, missing mental features, having no emotional autobiography, feeling as if every day is the same, loss of relationship or emotional attachments, feeling detached from external reality – Cognitive Decline, Loss of Visual imagination, inability to think in a deep or complex way, inability to hold information, loss of past learned skills and knowledge. – Complete Lack of goal-directed motivation, having no automatic self direction, no long term goals – Anhedonia – inability to enjoy or derive pleasure, nothing to look forward to, no bodily joy, satasfaction and so on – Lack of atmosphere/deepness of the outside reality, inability to appreciate beauty, things look flat and insignificant. All symptoms in various severity of course, It’s time to talk, what is this condition exactly, Did you suffer from depression your entire life? Is this episodic? how are you planning to solve it? how did you acquire it? had any professional been aware of it? Is it medication induced? Is there any outside outlet acknowledging this specific phenomena? How much time do you suffer from it? What were you diagnosed with? Was it sudden or progressively? Had anything helped at all? Would you join a group for people suffering the same condition? Is anyone interested in doing so? Please do respond!

Topic 2

people 0.044, depression 0.037, doctor 0.028, psychiatrist 0.020, make 0.020, bad 0.016, therapy 0.016, therapist 0.015, find 0.014, problem 0.013

Most representative post for this topic, with probability for topic 2 of 0.53: https://www.reddit.com/r/depressionregimens/comments/iij4tr

I talked to him today, he says all my problems are my choice and I choose to be lazy, suicidal, depressed etc. Is he right?,Dude… if he believes this then he must also believe that his career is total quackery. Get a new psychiatrist immediately. What a piece of shit.,absolutely not, please get a new psychiatrist!! you don’t choose to be suicidal or depressed, and in my experience, depression causes laziness more often than not. it’s worrisome that a professional outright said this to you and honestly I would report him if you can. that’s such a harmful thing to say to anyone suffering from such issues and to say it to the wrong person could be really catastrophic. i’m sorry he’s a dick to you, don’t listen to his bullshit. if it was so easy as to choose not to be depressed then nobody would fucking be depressed. it’s like he thinks people enjoy feeling this way ugh,OMG please please PLEASE never go back there. I once had a psychiatrist tell me I was gonna end up on a street corner with a sign (spoiler alert: I have a career and own a house). I got up and left and never looked back. Remember that YOU are a huge part of your mental health journey. It’s a collaborative effort between you, your psychiatrist, therapist (if you have one), and any other professional you choose to involve. You always have a say, and if something doesn’t seem right, you don’t have to go along with it. Your feelings are ALWAYS valid—don’t ever let anyone tell you differently. You are not alone in this. So many of us are depressed, anxious, suicidal, attention deficit, bipolar, lazy…these are NOT choices. Who would choose to be this way? There are plenty of helpful professionals out there, just make sure you screen them carefully. I believe in you and wish you well!!! …

Topic 3

day 0.037, thing 0.035, feel 0.033, make 0.024, find 0.017, good 0.016, exercise 0.016, eat 0.013, walk 0.013, lot 0.013

https://www.reddit.com/r/depressionregimens/comments/dztdw9

Topic probability: 0.53

Wanted to share something that I’ve recently found to help when I’m struggling to find motivation to complete basic chores. This one specifically deals with laundry, but it can apply to other tasks as well. If you’re like me, you can have laundry sitting there for weeks not being put away. The mountain of clothing is so overwhelming that I just ignore it all together. I’m also an all-or-nothing person; I just wait and wait until a good day when I’ll have enough energy to get it done. Those days are exceedingly rare, so that mountain of clothes will sit there for a loooong time, stressing me out and depressing me even more. I’m trying to switch my mindset to not feeling like I need to take on such giant tasks all at once. I decided to break up the tasks into smaller ones. For the mixed load of laundry that needed to be put away, I told myself I only need to put away the socks and underwear today. Then tomorrow I put away the shirts. The next day, fold pants, and the next everything else that goes on hangers. These smaller tasks only take like 5-10 minutes each, and it’s satisfying to see the pile of clothes dwindle every day versus sit there ominously for several weeks. If you’re feeling overwhelmed, break up your tasks into very small, easily attainable goals. Go easy on yourself and do what you can for the day. Even just the tiniest amount of progress is a good thing.,great advice. Anytime you get anxiety over a task or a situation seems to complex or overwhelming. Just break in down into manageable pieces. Doing SOMETHING is always better than nothing even if it seems like too little or not enough or w/e.,I saw a meme about ‘anything worth doing is worth doing badly’ that addresses this. I try and remember that some days. Us perfectionists want to always do 100%. 
But in a lot of things (not everything, obviously, just as a general rule) doing 50% of the job, or 90% of the job, is way better then the 0% of the job we do because of that crippling dedication to doing 100%. Not an excuse for doing bad jobs on the stuff that really matters, but can be a much healthier way to approach doing general day-to-day stuff…

Topic 4

ssris 0.027, antidepressant 0.024, effect 0.024, drug 0.022, side_effect 0.020, depression 0.019, serotonin 0.016, prescribe 0.014, treat 0.013, ssri 0.012

Reddit post: https://www.reddit.com/r/depressionregimens/comments/bheg7d

Topic probability: 0.64

Hey y’all, this is a repost of the stickied post made by /u/jugglerofworlds, who appears to have deleted their account and their post along with it. I’ve edited it a little and will continue to keep it updated as needed. Suggestions are welcome. As the former post was, I’m trying to keep this confined to prescription medications, and not natural/herbal remedies (though I recognize that they definitely can be helpful means of treatment). I’m also typically avoiding medications that have been withdrawn from the market and thus aren’t really prescribed. In a future revision of this post I hope to add an additional column featuring which medications are available where, as some of these are approved in European countries but not in the U.S., and vice versa. # Icon key * ✔️ = approved to treat condition by a regulatory agency (FDA, EMA, ANSM, etc) * ➕ = approved as an adjunct treatment by a regulatory agency, to be used in combination with other medications to treat a condition (may or may not be used off-label as a monotherapy) * 🏷️ = Off label use; widely prescribed for condition but not necessarily rigorously studied for it * ⚠️ = experimental medication; in FDA Phase III trials or pending approval # Selective Serotonin Reuptake Inhibitors (SSRIs) |Generic name|Brand name(s)|Treats depression|Treats anxiety| |:-|:-|:-|:-| |citalopram|Celexa|✔️|🏷️| |escitalopram|Lexapro|✔️|✔️| |fluoxetine|Prozac|✔️|✔️| |fluvoxamine|Luvox/Luvox CR|✔️|✔️| |paroxetine|Paxil/Paxil CR|✔️|✔️| |sertraline|Zoloft|✔️|✔️| # Serotonin Modulator and Stimulators (SMS) |Generic name|Brand name(s)|Treats depression|Treats anxiety| |:-|:-|:-|:-| |vortioxetine|Trintellix|✔️|🏷️| |vilazodone|Viibryd|✔️|🏷️| # Serotonin-Norepinephrine Reuptake Inhibitors (SNRIs) |Generic name|Brand name(s)|Treats depression|Treats anxiety| |:-|:-|:-|:-| |venlafaxine|Effexor/Effexor XR|✔️|✔️| |desvenlafaxine|Pristiq|✔️|🏷️| |duloxetine|Cymbalta|✔️|✔️| |milnacipran|Savella|✔️|✔️| |levomilnacipran|Fetzima|✔️|🏷️| 
|atomoxetine|Strattera|⚠️|⚠️| # Tricyclics (TCAs) ## TCAs with a preference for serotonin |Generic name|Brand name(s)|Treats depression|Treats anxiety|…

Topic 5

treatment 0.035, ketamine 0.028, year 0.022, work 0.021, drug 0.017, hope 0.015, hear 0.012, lithium 0.011, people 0.010, infusion 0.009

Reddit post: https://www.reddit.com/r/depressionregimens/comments/axtnj8

Topic probability: 0.58

https://www.washingtonpost.com/health/2019/03/06/biggest-advance-depression-years-fda-approves-novel-treatment-hardest-cases The Food and Drug Administration approved a novel antidepressant late Tuesday for people with depression that does not respond to other treatments — the first in decades to work in a completely new way in the brain. The drug, a nasal spray called esketamine, has been eagerly anticipated by psychiatrists and patient groups as a powerful new tool to fight intractable depression. The spray acts within hours, rather than weeks or months as is typical for current antidepressants, and could offer a lifeline to about 5 million people in the United States with major depressive disorder who haven’t been helped by current treatments. That accounts for about one in three people with depression. “This is undeniably a major advance,” said Jeffrey Lieberman, a Columbia University psychiatrist. But he cautioned much is still unknown about the drug, particularly regarding its long-term use. “Doctors will have to be very judicious and feel their way along,” he said. The label for the drug will carry a black box warning – the most serious safety warning issued by the FDA. It will caution users they could experience sedation and problems with attention, judgment and thinking, and that there’s potential for abuse and suicidal thoughts. People who take esketamine will have to be monitored for at least two hours after receiving a dose to guard against some of these side effects…

Topic 6

work 0.053, anxiety 0.030, mg 0.025, bad 0.020, high 0.020, vitamin 0.018, diet 0.015, supplement 0.014, post 0.012, literally 0.011

Reddit post: https://www.reddit.com/r/depressionregimens/comments/alh4r3

Topic probability: 0.52

About 3 or 4 years ago, I developed a severe form of anxiety disorder where it manifested in panic attacks characterized by intense bouts of nausea, gagging, and retching. It didn’t usually get bad enough to get to vomiting, though it did in a few instances (in which I went to the hospital afterwards). My body responds to stress naturally by gagging and nausea. So imagine being anxious all the time but also NAUSEOUS 24/7, and I mean literally 24/7 without any respite. At times I was seriously considering suicide because of how bad I felt all the time every day. The whole thing started I think because I had to present at a large conference with thousands of people in attendance, and I had a very bad experience being insulted by some people at a previous iteration of this conference years ago. I was commuting to work one day (before the conference) and suddenly got this massive bout of nausea where I felt like I was dying. I realized however that this was my body telling me I have stagefright. I expected my nausea to evaporate once I finished speaking, as it usually would have in the past. Except that it didn’t. It stayed, and remained with me for years. I tried everything but avoided antidepressants for the longest time due to the bad rep they get. 
I tried the following medications: * Ginger – in various forms – for nausea (didn’t work) * Peppermint – in various forms – for nausea (didn’t work) * Ondansetron (zofran) – 4 mg; as needed – for nausea (didn’t work) * Chlordiazepoxide/clidinium bromide (librax) – 5 mg; once daily – for nausea and anxiety (didn’t work) * Pyridoxine/doxylamine (diclectin) – 10 mg pyridoxine, 10 mg doxylamine; 2 tablets at bedtime – for nausea (didn’t work) * Metoclopramide – 1 tablet daily – for nausea (didn’t work) * Domperidone – 10 mg; once daily – for nausea (didn’t work) * Propranolol – 10 mg; twice daily – for anxiety (didn’t work) * Prochlorazapine – 10 mg; twice daily – for nausea (didn’t work) * Lorazepam (Ativan) – 1 mg; 1 tablet at bedtime – for anxiety (didn’t work; just made me really sleepy) * Pantoprazole (Tecta) – 1 tablet daily – for nausea (didn’t work) * Dimenhydrinate (Gravol) – 1 tablet as needed – for nausea (didn’t work) * Nabilone (cesamet) – 0.5 mg as needed – for nausea (worked for nausea but not anxiety, and gave me a really uncomfortable high) * Clomipramine (Anafranil) – 10 mg. once daily – for anxiety (didn’t try properly due to side-effects) I was afraid even of getting out of my own house. I was afraid of meeting people. I was afraid of leaving my own room – the only place where I felt somewhat at ease and the nausea wasn’t THAT bad. The only thing that worked somewhat to relieve the nausea was chewing on things, whether that meant food at mealtimes, or fennel seeds, or sucking on mints/cough drops. So I carried mints and fennel seeds with me at all times no matter where I was – including in the washroom in my own house and even when I wanted to take a shower I had to have them nearby otherwise I would literally throw up in the shower. But these were not long-term cures to my problem and only a short alleviation of the symptoms (and not that effective if I was more anxious than usual). 
I somehow graduated from university with a degree in neuroscience and fought through this nausea-anxiety for 2 years doing so. My graduation ceremony – which was supposed to be a happy occasion – was marred by constant nausea and me going through at least 3 entire tins of mints because my body handles excitedness the same way as it does for anxiety. Literally nothing was working and I was at my wit’s end. So I went downtown Toronto and bought CBD oil from a dispensary. I only did this because I was literally desperate, even though I had never done any recreational drugs in my life upto that point (except caffeine), and even though I had a horrible experience with nabilone (synthetic THC for cancer patients to reduce their nausea) so I was really kind of anxious about even using that. But it worked…

Reddit Scraper for Depression Regimens – Ngrams

Reddit is a great source of information containing posts about depression treatments, supplements, diets, and nootropics. Only specific psychotropic medications are prescribed for depression and anxiety and go through clinical trials with large enough sample sizes; for other substances we only have anecdotal stories from online users. I can’t perform a randomized controlled trial of matcha green tea’s possible antidepressant qualities without a lab and a grant, but we can use natural language processing to at least summarize some information based on users’ reviews of various supplements.

Below are top ngrams (unigrams, bigrams, and trigrams), based on the text from posts and comments from the depressionregimens subreddit. For this data sample only the top posts and top comments were selected. Posts or comments shorter than three words were removed. The data sample consisted of 1,458 documents (each document being a post or a comment). Data cleaning included removing html tags, expanding common contractions, removing newlines and tabs, removing urls, spelling correction (python’s SymSpell), lemmatization, lowercasing, and removing special characters and extra whitespaces. A list of names that included supplements, neurotransmitters, antidepressants, and other psychotropic medications was created and excluded from spell check, in order to avoid changing these words (for example we don’t want to change ‘ssris’ to ‘saris’, which is what the SymSpell library was doing).
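A rough sketch of the regex-based part of this cleaning is below, using only the standard library. Spelling correction and lemmatization are left out because they need external libraries, the contraction list is a tiny illustrative subset, and the order of the steps is my own choice rather than the one used for the actual data.

```python
import html
import re

# Small illustrative subset; a real contraction list is much longer.
CONTRACTIONS = {"can't": "cannot", "don't": "do not", "i'm": "i am", "it's": "it is"}

def clean_text(text):
    text = html.unescape(text)                        # &amp; -> &
    text = re.sub(r"<[^>]+>", " ", text)              # strip html tags
    text = re.sub(r"https?://\S+", " ", text)         # strip urls
    text = text.replace("’", "'").lower()             # normalize apostrophes
    for short, full in CONTRACTIONS.items():          # expand contractions
        text = re.sub(r"\b" + re.escape(short) + r"\b", full, text)
    text = re.sub(r"[^a-z0-9\s]", " ", text)          # drop special characters
    return re.sub(r"\s+", " ", text).strip()          # newlines, tabs, extra spaces

def keep_document(text, min_words=3):
    """Keep only posts or comments with at least three words."""
    return len(text.split()) >= min_words

raw = "<p>I can't sleep.\n\tSee https://example.com for SSRIs &amp; more!</p>"
print(clean_text(raw))   # i cannot sleep see for ssris more
```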

The ngrams were selected such that each ngram appears in fewer than 70% of the documents. Absolute and relative frequencies were calculated for each ngram. The top unigrams were as follows: get, depression, feel, go, try, thing, day, work, take, make, help, time, good, one, also.
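The counting and the 70% document-frequency filter can be sketched like this. The function names and the toy documents are hypothetical, and defining relative frequency as the share among the kept ngrams is my assumption.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def top_ngrams(documents, n, max_doc_share=0.7, top_k=10):
    """Return (ngram, absolute frequency, relative frequency) tuples for
    n-grams that appear in fewer than max_doc_share of the documents."""
    term_freq = Counter()   # total occurrences across all documents
    doc_freq = Counter()    # number of documents containing the n-gram
    for doc in documents:
        grams = ngrams(doc.split(), n)
        term_freq.update(grams)
        doc_freq.update(set(grams))
    limit = max_doc_share * len(documents)
    kept = {g: c for g, c in term_freq.items() if doc_freq[g] < limit}
    total = sum(kept.values())
    return [(g, c, c / total) for g, c in Counter(kept).most_common(top_k)]

docs = [
    "side effect go away",
    "bad side effect every day",
    "walk every day feel well",
    "feel well every day",
]
# 'every day' occurs in 3 of 4 documents (75%), so it is filtered out.
print(top_ngrams(docs, 2, top_k=3))
```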

Examples of posts/comments (original text, before data cleaning) with these top unigrams:

I am going to write this down somewhere.. and then take steps to figure out how to work them all away…I do all of these.. The social media/phone time one is the hardest for me. Maybe I’ll invest in one of those timer boxes I can throw it into. Then I’ll have no choice but to be productive and hopefully more creative. My depression always gets so bad around shark week. So hard to sleep and stay asleep. So for a few days out of the month I really don’t have a choice on that one. But it can easily spiral out of control if I’m not putting in constant effort. I am very tired but also wired feeling right now.

Being diagnosed with terminal cancer you will probably die. There are a lot alternatives to treat depression, regular cardio, different therapy methods, drugs and non-drugs treatments (rTMS, ECT etc.), and if you try everything and nothing work, you can survive until a new treatment arrives. Anyway, I read a lot people refusing antidepressant because “side effects”, so I think depression isn’t so bad for them, Because think about this: a guy/girl with terminal cancer will take any treatment on market if he/she can pay, ignoring side effects because she/he want live.

Thanks for sharing – having a particular difficult day today, it’s nice to hear a success story. I’ve researched this in my area, seems quite expensive, hence I’ve not been able to try it, though I’ve wanted to. Has it been that way for you?
Also, I’ve been told several times that those dependent on benzodiazepine medications do not respond as well (or at all) to IV ketamine, so those must be discontinued before infusions. During the 25 years of so many medicines, did you take benzodiazepines at all?
> But I’m stable. I actually know what happiness feels like. And most importantly, I’m alive.
Amazing to read! Thanks again for a real success story. I wish you the very best of continued health and happiness!

The top bigrams were as follows: side effect, every day, make feel, feel well, mental health, long term, year ago, depression anxiety, treatment resistant, treat depression. Below are some post/comment examples with the top bigrams:

Ketamine crushed rumination that I had been trapped in my whole life. Repeating intrusive negative thoughts of the past. Wiping out the massive, crippling fog of depression was wonderful but that side effect of stopping those negative thoughts was life altering. Glad we found it, even if I am approaching 50 years old.

I broke the sleep/ work depression routine by walking at first. Hour long walks at a quick pace, fast enough that it was challenging. Did that for a month or so. I actually managed to lose 5 pounds that first month so there was a nice bonus. It got me thinking my diet needed improving so I cut out fast food as much as I could and starting making lean meals for myself as much as I could. After another month, that “swollen” feeling you describe started to lessen. So two months in, down 12 pounds, I joined a gym but never touched free weight. Just cardio. It was more intense than walking and took a bit to adjust to the new pace. I left a sweaty mess every day. Did that for about 6 months. I was in decent shape. Down about 30 pounds overall. My brain felt clearer and I had more energy. It’s important to isolate the depression, give it less ammunition to use against you. **One way to do that is to not let it use your body against you.**

After trying over 15 different medications and several rounds of Ketamine IV infusions for my severe treatment-resistant depression, I was about to give up. On everything. I saw a couple posts on this group about how some people have had success with Trintellix, so in a last ditch effort in desperation, I talked to my doctor and started it about a month ago. Within a week my life had changed. The existential dread had lifted. I became interested in things again. For the first time since I can remember I wasn’t exhausted in the middle of the day. I had energy. I smiled. I felt some joy. And it has continued and it’s only been getting better. I think what really happened was that it gave me the jumpstart I needed to start a small exercise regimen and care about eating right, which made me feel even better. It did make me extremely nauseous for the first week but it helped to take it with food and then the side effect went away. Thank you to those who shared their experience and I hope maybe this helps someone as well. There is hope, just keep swimming.

The top trigrams were as follows: treatment resistant depression, major depressive disorder, sexual side effect, make feel good, make feel well, mental health issue, get new psychiatrist, severe treatment resistant, stay bed day, time every day. Below are some post/comment examples with the top trigrams:

Speaking from personal experience, the only type of medication that improved my symptoms were the MAOIs.
These are more old school, and more dangerous. But many have said they are a life saver for treatment resistant depression.
Contrary to conventional antidepressants, they don’t just boost serotonin/dopamine/norepinephrine – they also boost a range of neurochemicals such as trace amines like b-phenylethylamine, which themselves promote the release of neurotransmitters.
MAOIs are so powerful that you have to watch your diet and abstain from a whole range of other drugs.

The sexual side effects, tiredness, agitation and added anxiety all pushed me away from SSRIs. I did like being numb though. Except in the genital area… that created a huge depression in itself. Been off for months now.

Wait, you’ve told your psychiatrist about this, and they didn’t do anything? If so, you need to get a new psychiatrist.
I don’t want to make a diagnosis but have you considered the possibility that you might have bipolar depression? SSRIs can cause hypomania and are considered dangerous for patients with BP. That’s why I said a new doctor is in order. Thankfully, there are antidepressants that don’t cause this reaction, as well as mood stabilizers to prevent the crash you talked about.
Lastly, it sounds like you’re also dealing with a lot of stuff from your past. Are you seeing a therapist right now? They can help you work through those memories and deal with the intense emotions you get in a way that makes your life better and not worse.

We can even obtain some four-grams: severe treatment resistant depression, job really well respected, amazing job really well, previous alcoholism push man, girl ever meet amazing.
Post/comment examples are below. I really enjoyed reading the first story: I had not previously heard of diphenidine, and it was interesting to learn about the substance and the user’s experience.

I meant to post about this sooner and regret not doing so, but hopefully it’s helpful to some and doesn’t break any rules I’m not aware of. I know this subreddit has a focus on safe and researched substances and realise that this is an entirely anecdotal report concerning a not very well-researched substance, but I hope it’s not a problem and think it’s valuable information for someone suffering from severe treatment-resistant depression.
Back in 2015, my husband (23 years old, weight 62 kg) had been feeling severely depressed with suicidal ideation for several weeks. It got to the point where I felt I had to either call in the mental-health people (whom I knew from previous experience to be quite inept) or take a drastic pharmacological measure.
I had read about the rapid and long-lasting antidepressant responses to NMDA-receptor antagonists like ketamine before, and acquired samples of two of ones that are orally active (diphenidine, as well as methoxphenidine, also known as MXP).
NMDA-receptor antagonists appear to produce their antidepressant effects by causing an increase in levels of brain-derived neurotrophic factor (BDNF) that can last for days or weeks following a single dose, whereas the most commonly used antidepressants produce a similar increase in BDNF only after weeks of continuous administration, while also causing many side effects.
Neither of us had ever used any kind of dissociative before, just classical psychedelics, stimulants and marijuana (while visiting a country where that’s legal), so, given his fragile psychological state, I wanted to start with a very careful small dose.
Looking at people’s comments on diphenidine and methoxphenidine online, I couldn’t find anything related to attempts at therapeutic use, nor a clear consensus on a preference for either one. I ended up looking up dosage information for diphenidine, and read that 50 mg was considered a threshold dose.
I first gave him 10 mg of diphenidine in a capsule the first time to be safe; as expected, that had no noticeable effects.
2 hours later I gave him another 20 mg, which still led to no noticeable effects, except possibly a very mild numbing of the senses.
Another 2 hours later I gave him another 30 mg. About 15-20 minutes after this, he reported that he was maybe starting to feel slight derealisation effects.
Until this point he had been playing Skyrim to try to take his mind off his bad feelings; he really wasn’t expecting this to work at all, but he trusted my knowledge of drugs and figured it couldn’t hurt to at least try it.
When the effects started to set in, I told him I’d read that some people like to lie in bed while on drugs like this, and he did so.
His mood didn’t seem much changed, but after lying in bed for a bit, he started talking to me about some of the things that had been bothering him. He sounded sad while talking about these things, but I tried to steer the conversation toward solutions that we could decide on that would make life more satisfying for him.
After chatting for a bit, he seemed to be getting somewhat amused by the effects of the drug; he said things he touched felt very different, and everything felt strange, but not in a bad way.
As we talked some more about his issues, his mood slowly lifted (I think this was around the peak of the experience, which lasted a good portion of the day), and suddenly he got a little smile on his face and said that he was starting to feel… happy. Of course this made me really happy.
He started saying how things felt “solid”, “thick”, “real” and “tangible”, in contrast not only to the way things normally felt but also to the way things had been feeling to him particularly during his weeks of feeling depressed. He related this more solid experience of physical objects to an improved outlook on life.
Interestingly, despite diphenidine being a dissociative drug, it appears to have triggered a reversal of symptoms of dissociation/derealisation that accompanied his depression prior to the treatment.
He said he kind of felt similar to being very drunk, I assume in relation to physical coordination.
He also reported feeling significant time dilation, “in a good way”. (He contrasted this with the time dilation he feels on classical psychedelics, which he tends to find uncomfortable or scary, as though a moment will last forever.)
He then seemed to get a big urge to get up and do lots of tidying and cleaning around our apartment, and he started doing so; I helped. We folded clothes, organised the living room, cleaned the kitchen, stuff like that.
He said that he felt like everything was being put in its place again, both physically and mentally; that his mind was tidy again.
Around this point, he seemed to have this constant feeling of awe at how content he was feeling with life. This wasn’t some kind of unnatural euphoria, just a very strong feeling of contentness, which had obviously been missing from his life for a long time.
Several times, he seemed to have tears in his eyes in awe of how at peace he felt with the world.
Seeing someone emerge from such a deep depression in a matter of hours was really beautiful.

Several times, he hugged me and told me how grateful he was to me for finding this drug for him.
I imagine the talking was therapeutic (which could also have happened without the drug, but was, I imagine, stimulated by it), but mainly I’m certain the drug caused a biochemical change in his brain that has reversed, at least for a time, the natural process that makes him prone to feeling depressed all the time.
The dissociative effects did not fully diminish until he slept; he had no trouble sleeping.

Two days later I asked him how he was feeling, and he smiled and said he was feeling just fine.
More than two weeks later, his depression still had not returned.
This was a massive change. It seems diphenidine can be a powerful medicine. 🙂

He later took it again, this time at 60 mg in one go (about 1 mg/kg), and he felt that this reinforced the antidepressant effects, and that repeating this every few weeks would probably keep him happy in the long term, and the interval we settled on was one dose every 12 days (taken right after waking up to avoid impacting sleep the next night).
In the 5 years that followed, he continued to benefit enormously from diphenidine, and he continues to take it every 12 days. Although after a while there was some tolerance and it no longer led to complete resolution of symptoms, he continues to find it well worth it. The dosage has slowly had to be raised from 60 mg 5 years ago to around 125 mg currently (by about 16% per year) to maintain a similar level of acute effects. We’ve also discovered that adding 200-250 mg of black pepper (which contains piperine, a bioavailability enhancer) in the same capsule makes it a lot more potent.

I wonder how many people commit suicide every year who could have been saved by something like this… granted not a lot of research has been done on using NMDA-receptor antagonists for this indication and there may be unknown risks, but when someone has severe depression that cannot be managed effectively with approved medication or is even ready to commit suicide, I think there’s a very strong case to be made that something like diphenidine should be tried, at least as a last resort.

Of course it’s important to be careful not to use substances like this too frequently, since they have been known to lead to addiction with very frequent use (although, having tried it myself, I personally don’t see how the effects of this particular one could be considered enjoyable by most people). But for my husband, there has been no addiction or any other ill effect over 5 years of regular use.
He is now also taking the MAOI tranylcypromine (Parnate); based on the limited research that has been done, and our experience, there is no interaction between it and diphenidine, although there probably would be with various other dissociatives.

Another example with a four-gram:

We’ve all been there brother. I lost the best girl I’ve ever met, an amazing job at a really well respected business and a lot of good friends through my previous alcoholism. You just have to push through it man. Even making the tiniest changes in your life will snowball into a world of difference, life always finds a way of working itself out.
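
For readers who want to reproduce this kind of n-gram ranking, here is a minimal sketch using only the Python standard library. It is not the exact pipeline behind the lists above: the stop-word list, the `tokenize` and `top_ngrams` helpers, and the three sample posts are all made up for illustration. The n-grams reported in this post also appear to be lemmatized (for example "girl ever meet amazing" from "the best girl I’ve ever met"), which this sketch does not attempt.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# a real analysis would use a fuller list from an NLP library.
STOP_WORDS = {
    "a", "an", "the", "and", "or", "of", "to", "in", "for", "is", "are",
    "was", "i", "my", "it", "that", "this", "with", "have",
}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def top_ngrams(documents, n, k=10):
    """Return the k most frequent n-grams across all documents."""
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        # Slide a window of n tokens over each document separately,
        # so n-grams never span two different posts.
        counts.update(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts.most_common(k)


# Made-up sample posts, not taken from the dataset.
posts = [
    "MAOIs are a life saver for treatment resistant depression.",
    "I have severe treatment-resistant depression.",
    "Severe treatment resistant depression is exhausting.",
]

print(top_ngrams(posts, n=3, k=3))  # trigrams
print(top_ngrams(posts, n=4, k=1))  # four-grams
```

Removing stop words before building the n-grams is what produces compact phrases such as "treatment resistant depression"; building the n-grams first and filtering afterwards would give different results.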