R and Python

December 11, 2020   


re.sub, re.split, and count_chars

Something interesting that I have learned in Python is using the re (RegEx) module to replace characters, symbols, or letters in a string with something else, then splitting the string and creating a function, count_chars, to count how many times each word appears. I used these in HW 11 and decided to create my own Christmas-themed example, since it is that time of year again!

I made a string with a couple of sentences and used re.sub to replace all of my punctuation marks with <3, just for fun. The output below shows <3 in place of each punctuation mark.

import re
strings=r"Christmas is my favorite holiday of the year. I like having my entire family and friends over. Decorating for christmas and baking goodies is some of my favorite things to do. Watching cheesy christmas movies is always a blast! I would say one of my favorite christmas movies is 'How the Grinch Stole Christmas' (2000) with Jim Carrey as the Grinch and the Polar Express. My Favorite line/scene of the Grinch is: The nerve of those Whos! Inviting me done there, on such short notice! Even if I wanted to go my SCHEDULE Wouldn't allow it! One o'clock, Wallow in self pity. Four thirty, Stare into the abyss. Five o'clock, Solve world hunger; Tell no one. Five thirty, Jazz-ercise.Six thirty, dinner with me. I can't cancle that again! Seven o'clock, wrestle with my self-loathing... I'm booked! Well if I bump the loathing to to nine I could still be done in time to lay in bed and stare at the ceiling and slip slowly into madness...But what would I wear !! "
re.sub("[.,!'()/:;-]",'<3', strings)
## 'Christmas is my favorite holiday of the year<3 I like having my entire family and friends over<3 Decorating for christmas and baking goodies is some of my favorite things to do<3 Watching cheesy christmas movies is always a blast<3 I would say one of my favorite christmas movies is <3How the Grinch Stole Christmas<3 <32000<3 with Jim Carrey as the Grinch and the Polar Express<3 My Favorite line<3scene of the Grinch is<3 The nerve of those Whos<3 Inviting me done there<3 on such short notice<3 Even if I wanted to go my SCHEDULE Wouldn<3t allow it<3 One o<3clock<3 Wallow in self pity<3 Four thirty<3 Stare into the abyss<3 Five o<3clock<3 Solve world hunger<3 Tell no one<3 Five thirty<3 Jazz<3ercise<3Six thirty<3 dinner with me<3 I can<3t cancle that again<3 Seven o<3clock<3 wrestle with my self<3loathing<3<3<3 I<3m booked<3 Well if I bump the loathing to to nine I could still be done in time to lay in bed and stare at the ceiling and slip slowly into madness<3<3<3But what would I wear <3<3 '
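A smaller sketch of the same substitution may make the pattern easier to read. The square brackets define a character class, so any one of the listed characters is matched, and the hyphen sits last so it is read as a literal character rather than a range. The sentence and the names sample and hearted are made up for illustration.

```python
import re

# Hypothetical short sentence, used only to illustrate the pattern.
sample = "Ho, ho, ho! Merry Christmas (to all)."

# Character class: any single character listed inside [...] matches.
# The hyphen is placed last so it is read literally, not as a range.
hearted = re.sub(r"[.,!'()/:;-]", "<3", sample)

print(hearted)
# Ho<3 ho<3 ho<3 Merry Christmas <3to all<3<3
```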

Next, I saved the substituted string as strings1 and then split it on whitespace to prepare for counting the individual words in my string.

strings1=re.sub("[.,!'()/:;-]",'<3', strings)
re.split("\\s+", strings1)
## ['Christmas', 'is', 'my', 'favorite', 'holiday', 'of', 'the', 'year<3', 'I', 'like', 'having', 'my', 'entire', 'family', 'and', 'friends', 'over<3', 'Decorating', 'for', 'christmas', 'and', 'baking', 'goodies', 'is', 'some', 'of', 'my', 'favorite', 'things', 'to', 'do<3', 'Watching', 'cheesy', 'christmas', 'movies', 'is', 'always', 'a', 'blast<3', 'I', 'would', 'say', 'one', 'of', 'my', 'favorite', 'christmas', 'movies', 'is', '<3How', 'the', 'Grinch', 'Stole', 'Christmas<3', '<32000<3', 'with', 'Jim', 'Carrey', 'as', 'the', 'Grinch', 'and', 'the', 'Polar', 'Express<3', 'My', 'Favorite', 'line<3scene', 'of', 'the', 'Grinch', 'is<3', 'The', 'nerve', 'of', 'those', 'Whos<3', 'Inviting', 'me', 'done', 'there<3', 'on', 'such', 'short', 'notice<3', 'Even', 'if', 'I', 'wanted', 'to', 'go', 'my', 'SCHEDULE', 'Wouldn<3t', 'allow', 'it<3', 'One', 'o<3clock<3', 'Wallow', 'in', 'self', 'pity<3', 'Four', 'thirty<3', 'Stare', 'into', 'the', 'abyss<3', 'Five', 'o<3clock<3', 'Solve', 'world', 'hunger<3', 'Tell', 'no', 'one<3', 'Five', 'thirty<3', 'Jazz<3ercise<3Six', 'thirty<3', 'dinner', 'with', 'me<3', 'I', 'can<3t', 'cancle', 'that', 'again<3', 'Seven', 'o<3clock<3', 'wrestle', 'with', 'my', 'self<3loathing<3<3<3', 'I<3m', 'booked<3', 'Well', 'if', 'I', 'bump', 'the', 'loathing', 'to', 'to', 'nine', 'I', 'could', 'still', 'be', 'done', 'in', 'time', 'to', 'lay', 'in', 'bed', 'and', 'stare', 'at', 'the', 'ceiling', 'and', 'slip', 'slowly', 'into', 'madness<3<3<3But', 'what', 'would', 'I', 'wear', '<3<3', '']
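The empty string at the end of the list comes from the trailing space in the original text. Here is a minimal sketch of two ways to avoid it, assuming that empty entry is not wanted. The variable names and the short text are hypothetical.

```python
import re

# Hypothetical text with a trailing space, like the original string.
text = "But what would I wear <3<3 "

# re.split on whitespace keeps an empty string after the final space.
with_empty = re.split(r"\s+", text)

# Stripping first, or using str.split() with no arguments, avoids it.
stripped = re.split(r"\s+", text.strip())
plain = text.split()

print(with_empty)  # ['But', 'what', 'would', 'I', 'wear', '<3<3', '']
print(stripped)    # ['But', 'what', 'would', 'I', 'wear', '<3<3']
print(plain)       # ['But', 'what', 'would', 'I', 'wear', '<3<3']
```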

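Lastly, I saved the split string as strings2 and then created a function called count_chars() to count how many times each word appears in my string, which is shown below. Despite its name, count_chars() counts whatever it loops over: characters when given a string, and words when given a list of words like strings2.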

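strings2=re.split("\\s+", strings1)
# Practice run: count each character in a short test sentence.
sentence = "Wow, this is super crazy."
counts = {}
for c in sentence:
    if c in counts:
        counts[c]+=1
    else:
        counts[c]=1
# The same loop wrapped in a function. It counts whatever it iterates over:
# characters for a string, words for a list of words.
def count_chars(sentence):
    counts = {}
    for c in sentence:
        if c in counts:
            counts[c]+=1   # seen before: add one
        else:
            counts[c]=1    # first time seen: start at one
    # Print one line per distinct item, in first-seen order.
    for c in counts:
        print(c, "appears", counts[c],"times")
count_chars(strings2)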
## Christmas appears 1 times
## is appears 4 times
## my appears 6 times
## favorite appears 3 times
## holiday appears 1 times
## of appears 5 times
## the appears 8 times
## year<3 appears 1 times
## I appears 7 times
## like appears 1 times
## having appears 1 times
## entire appears 1 times
## family appears 1 times
## and appears 5 times
## friends appears 1 times
## over<3 appears 1 times
## Decorating appears 1 times
## for appears 1 times
## christmas appears 3 times
## baking appears 1 times
## goodies appears 1 times
## some appears 1 times
## things appears 1 times
## to appears 5 times
## do<3 appears 1 times
## Watching appears 1 times
## cheesy appears 1 times
## movies appears 2 times
## always appears 1 times
## a appears 1 times
## blast<3 appears 1 times
## would appears 2 times
## say appears 1 times
## one appears 1 times
## <3How appears 1 times
## Grinch appears 3 times
## Stole appears 1 times
## Christmas<3 appears 1 times
## <32000<3 appears 1 times
## with appears 3 times
## Jim appears 1 times
## Carrey appears 1 times
## as appears 1 times
## Polar appears 1 times
## Express<3 appears 1 times
## My appears 1 times
## Favorite appears 1 times
## line<3scene appears 1 times
## is<3 appears 1 times
## The appears 1 times
## nerve appears 1 times
## those appears 1 times
## Whos<3 appears 1 times
## Inviting appears 1 times
## me appears 1 times
## done appears 2 times
## there<3 appears 1 times
## on appears 1 times
## such appears 1 times
## short appears 1 times
## notice<3 appears 1 times
## Even appears 1 times
## if appears 2 times
## wanted appears 1 times
## go appears 1 times
## SCHEDULE appears 1 times
## Wouldn<3t appears 1 times
## allow appears 1 times
## it<3 appears 1 times
## One appears 1 times
## o<3clock<3 appears 3 times
## Wallow appears 1 times
## in appears 3 times
## self appears 1 times
## pity<3 appears 1 times
## Four appears 1 times
## thirty<3 appears 3 times
## Stare appears 1 times
## into appears 2 times
## abyss<3 appears 1 times
## Five appears 2 times
## Solve appears 1 times
## world appears 1 times
## hunger<3 appears 1 times
## Tell appears 1 times
## no appears 1 times
## one<3 appears 1 times
## Jazz<3ercise<3Six appears 1 times
## dinner appears 1 times
## me<3 appears 1 times
## can<3t appears 1 times
## cancle appears 1 times
## that appears 1 times
## again<3 appears 1 times
## Seven appears 1 times
## wrestle appears 1 times
## self<3loathing<3<3<3 appears 1 times
## I<3m appears 1 times
## booked<3 appears 1 times
## Well appears 1 times
## bump appears 1 times
## loathing appears 1 times
## nine appears 1 times
## could appears 1 times
## still appears 1 times
## be appears 1 times
## time appears 1 times
## lay appears 1 times
## bed appears 1 times
## stare appears 1 times
## at appears 1 times
## ceiling appears 1 times
## slip appears 1 times
## slowly appears 1 times
## madness<3<3<3But appears 1 times
## what appears 1 times
## wear appears 1 times
## <3<3 appears 1 times
##  appears 1 times
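In the output above, "Christmas", "christmas", and "Christmas<3" are counted as separate words, and the empty string gets its own line. One possible way to tidy this up is sketched below using collections.Counter from the standard library. This is an alternative approach, not what I did in the post, and the token list is a made-up sample rather than the full strings2.

```python
from collections import Counter

# Hypothetical token list, shaped like the output of re.split above.
tokens = ["Christmas", "is", "my", "christmas", "is<3", "my", ""]

# Lowercase each token, strip the "<3" markers, and drop empty strings so
# that variants such as "Christmas" and "christmas" are counted together.
cleaned = [t.replace("<3", "").lower() for t in tokens]
cleaned = [t for t in cleaned if t]

word_counts = Counter(cleaned)

# most_common() returns (word, count) pairs, highest count first.
for word, n in word_counts.most_common():
    print(word, "appears", n, "times")
```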

Honestly, I struggled a lot with creating functions in Python, and Google was my best friend lol. I think the key thing to remember is to first write a simple test case, like the sentence above, so I can check that the counting loop works before defining my function around it, because I know I didn't do this correctly the first time!
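That "try it on something small first" habit can be sketched like this, assuming the function returns its counts instead of printing them so the result can be checked. The name count_items is hypothetical and not from HW 11.

```python
def count_items(items):
    """Return a dict mapping each item to the number of times it occurs."""
    counts = {}
    for item in items:
        # dict.get returns 0 for an item that has not been seen yet.
        counts[item] = counts.get(item, 0) + 1
    return counts

# Small input whose answer is easy to verify by hand.
check = count_items("aab")
assert check == {"a": 2, "b": 1}

# The same function works on a list of words.
print(count_items(["ho", "ho", "ho"]))  # {'ho': 3}
```

Returning the dictionary makes it easy to compare against an answer worked out by hand before running the function on a long string.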

Here's a meme:


