Being female is great! And DevBootcamp is a great place to be female. Rarely have I been in places where it’s difficult to be female, specifically, but this environment is significantly above average in terms of gender issue awareness. We have open discussions about gender and people seem to embrace their stereotypically opposite-gender traits more often than I see in the rest of the world. Early on, our instructor shared a self-aware story about how he noticed himself falling into some gender stereotypes at a previous workplace – this meant a lot to me to hear from a male perspective.

There are some ways that I still see gender as an issue. 

In this kind of learning environment, it is apparent that men are often more willing to mention concepts confidently in passing, even if they don’t really understand the concept. There have been so many situations with a number of guys where the conversation goes like this:

Guy: “Oh yes, blah blah technology x is really good for that blah blah”

Me: “What is technology x?”

Guy: “I’m not quite sure, it’s just something I heard someone else talking about.”

It’s weird to me, because rarely would I mention something with an opinion when I know I don’t understand the topic. I don’t think it’s the fault of men (or anyone who does this); it’s really just that our society hasn’t come to terms with how to discuss uncertainty and puts pressure on people to feign confidence. More on that some other time.

On a related note, men are also more often seen as people who can answer one-time questions, probably because they tend to give more assertive answers than I would, even when I have the same amount of knowledge. I tend to answer with what I know and then mention what I’m not sure of that might be useful to look into.

My response to gender awareness does make me do some weird stuff.

I am inclined to not do “feminine” stuff – like be the person who cleans up the kitchen for other people or talk about how much I love expensive shoes. I’ve also developed a reputation for hating CSS and front end, which is true: I do hate the feeling of moving boxes around (I am actually okay with the instances where I learn nuances about how elements tend to behave and why they’re designed that way). However, I think I’ve also been more vocal about it than I normally would be towards something I mostly find boring. If I’m honest with myself, this might be because I don’t want women in general to be seen as leaning towards softer design aspects of programming rather than data-heavy back end stuff. But on the other hand, it might be feeding into stereotypes of women being overly emotional; I don’t believe anyone would count me as being overly emotional, though, so it doesn’t feel like a risk to me.

This doesn’t apply to all men, but it is surprising how little men think about women’s issues. I guess it is reasonable, but at the same time, I’m pretty sure I think about poverty, race issues, LGBT issues much more than men here seem to have thought about women’s issues. However, since I’m Asian-American and female, it might be that I’ve been pushed to think more about minority issues in general.

Anyway, on the other hand, men here are absolutely willing to talk about women’s issues. I’ve had conversations with several people where I spout my thinking that “women’s issues are men’s issues; don’t you want men to be able to play with dolls if they want to?” I’ve found this to be a great thing about how men are raised in American society (note – not a way that men ARE, just how they are usually trained by society) – they’re happy to have heated arguments with you on whatever touchy subject, where both parties say politically incorrect things and will not hesitate to call you out on any details, but there’s no animosity outside of that conversation.

Conclusion – being female. It’s good. It’s good here.

I usually read on the train, but the past few days I’ve been slightly sick and haven’t had the energy. Instead, I’ve been sketching on my mp3 player and I’ve been thinking about how deeply drawing has enriched my life, even though I’ve had minimal training in visual art and I only do it occasionally. (And yes, I use a Samsung mp3 player. I have complex taste in music and a very dumb phone.)

I feel proud of a few of my drawings, but it’s overwhelmingly about the process of seeing and feeling your surroundings with more clarity, appreciating tangles of lines by seeing them as individual threads and trying to translate indescribable textures through the mundane magic of graphite (/a magical mobile device with a drawing app).

Drawing is the best way to see beauty in everyday things. I have never drawn anything and decided “ugh, upon closer inspection, this thing is ugly” – drawing always transforms the dullest entities into rich, ravishing visions. I feel like self-portraiture could be good therapy for anyone experiencing body image issues – I sometimes feel like my hands are stubby and asymmetrical, but drawing them always makes me a little happier with them.

Some thoughts on how to get started drawing:

  • Anyone who can write legibly has enough coordination to draw. The main problem is seeing things as shapes, shades, and lines, rather than as symbols (e.g. representing a person as a stick figure).
  • How to avoid seeing symbols? Start with drawing something wrinkly and unrecognizable and try to replicate visible lines, but don’t bother looking at your drawing.
  • When looking at real objects, look at small parts. Follow a shirt collar instead of the shape of someone’s body, or your thumbnail instead of trying to draw a hand (not a pun).
  • At some point you’ll convince yourself to see things as they appear (rather than as they are, oddly), and then it might make more sense to plot out drawings by large shapes first.

Questions I have and may write more about later:

  • What are the ethics of drawing people on the subway when they don’t know they’re being drawn?
  • Should I get one of those Wacom tablets?
  • Should I pitch some kind of learning-to-draw app for our final projects tomorrow instead of whatever I do end up pitching?

I’ve always adored literature as much as spreadsheets, so it makes sense that I started wondering about natural language soon after I started at DBC. Regretfully I haven’t made much progress beyond wondering, but I’m slated to give a brief “lightning talk” on something tomorrow, so I figured now is the time to summarize what I’ve gathered so far about this topic.

What is natural language processing (NLP) ?

NLP is a field of computer science that considers human language and how computers can interact with it. This includes relatively simple things like describing human-generated text in terms of frequency distributions, to very complex things like extracting meaning from texts or generating human-like language.
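To make the “relatively simple” end of that range concrete, here is a minimal sketch of a frequency distribution using only Python’s standard library. The function names (`tokenize`, `frequency_distribution`) and the sample sentence are made up for illustration, and the word-splitting rule is a crude assumption rather than how any particular NLP library does it.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    # Assumption: a "word" is any run of a-z or apostrophes. Real tokenizers
    # handle punctuation, numbers, and other languages far more carefully.
    return re.findall(r"[a-z']+", text.lower())


def frequency_distribution(text):
    """Count how many times each token appears in the text."""
    return Counter(tokenize(text))


sample = "The cat sat on the mat. The mat didn't mind."
freq = frequency_distribution(sample)
print(freq.most_common(2))  # [('the', 3), ('mat', 2)]
```

Even something this small is enough to start comparing texts, e.g. by looking at which words dominate one piece of writing but not another.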

Incidentally, Google Trends suggests “natural language” is actually a less popular topic now than it was in 2005. That’s interesting – I wonder if the field has now branched out too far for the general term to be used often.

What tools are easily accessible to us (i.e. people who recently started programming, primarily in Ruby) for processing natural language?

Ruby Treat

I figure I should mention this first since it’s a Ruby gem. I haven’t tried it yet, but it seems to have basic functions that are similar to Python’s NLTK. Treat does things like tokenizing, stemming, and parsing groups of words into syntactic trees (more detail on that later).


AlchemyAPI – a company that provides text-analysis services; a few groups have used this for final projects, since it does some high-level language processing for you instead of you having to write your own algorithms (which I guess could be crazy to attempt in a week-long project). They have a nice “getting started” guide for developers with examples of what they can do, including:

  • Entity extraction, keyword extraction – finding the subjects of sentences or larger pieces of text
  • Relation extraction – within sentences, isolating subject, action, object
  • Sentiment analysis – providing a numerical value on whether context around specific words is positive or negative
  • Language detection
  • Taxonomy – grouping articles into topics like politics, gardening, education, etc.

Semantria – seems to be comparable to Alchemy in that they also have an API that allows developers to request sentiment analysis for pieces of text; at a glance, their marketing seems to be more directed towards Twitter/social media.


 

Python’s NLTK is a well-known library for natural language processing, and Python is relatively similar to Ruby as a programming language. The NLTK introductory book is easy to read and simultaneously provides an introduction to Python. The basic concepts are easy to understand, but they quickly develop into sophisticated problems that remain issues in academic research. Some important concepts/vocabulary words are below, in the order the book mentions them, which runs from tasks that are basic and doable with simple built-in methods to concepts that require writing functions and large sets of data to provide meaningful results.

  • Tokenizing – splitting text into character groups that are useful. Often these are words, but I think it’s interesting how a word like “didn’t” could be tokenized into “did” and “n’t”
  • Frequency distributions are often used – frequencies of words, phrases, parts of speech, verb tense – these are all ways that different types of texts can be categorized. For example,
  • Corpora – these are large bodies of text data that may have some structure to make processing easier. The Brown Corpus is a famous one that includes texts from a variety of sources (religion, humor, news, hobbies, etc.), compiled in the 1960s, and there are many others – e.g. web chat logs, things in other languages
  • Other resources include things like dictionaries and pronunciation guides; WordNet is a “concept hierarchy” that has grouped words like frog and toad as descending from amphibian
  • Stemming and lemmas – stemming a word like “running” would result in its basic form/lemma “run”
  • Word segmentation – how to split up tokens when boundaries are not clear, e.g. with spoken language or languages where written text does not have grouping boundaries
  • Tagging – parts of speech often used to categorize words, with more POS than we normally consider in English
  • N-gram tagging – deciding on tags using context, e.g. when considering the probabilistic tag for word #5, consider words #1-4’s tags
  • Classifying texts – this is a big subject with a lot to consider – depending on what you want to classify, what features can be isolated by a computer program? How to judge accuracy? “Entropy” and information gain – how much more accurately can we classify texts with the addition of a new feature?
  • Naive Bayes classifiers – classify text based on individual features, moving closer to/farther from potential classifications with each piece of information; naive refers to considering all features independent
  • Chunking – segments sentences into groups of multiple tokens, e.g. grabbing a noun phrase like “the first item on the news.” Chunking tools generally are built on a corpus that has a large section of training data, where text has been grouped into the right chunks. The patterns of chunking in the training text inform the tool’s categorizing going forward.
  • Processing grammar and ways to translate written information into forms that computers can easily process for querying (this gets into the realm of IBM Watson)
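As a rough illustration of the naive Bayes idea from the list above, here is a small sketch in plain Python (no NLTK). It treats each word as an independent feature and nudges the score for each label up or down with every word it sees. The class name `NaiveBayes`, the add-one smoothing, and the tiny training sentences are all assumptions made up for this example; this is not NLTK’s implementation.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Crude word splitter (assumption: lowercase runs of letters/apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Toy word-count naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training texts
        self.vocab = set()                       # every word seen in training

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label in self.doc_counts:
            # Start from the log prior: how common this label was in training.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                # Each word adjusts the score independently (the "naive" part).
                # Add-one smoothing keeps unseen words from zeroing things out.
                count = self.word_counts[label][token] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data, purely for illustration.
classifier = NaiveBayes()
classifier.train("win free money now", "spam")
classifier.train("free prize click now", "spam")
classifier.train("lunch at noon tomorrow", "not spam")
classifier.train("see you at the meeting", "not spam")

print(classifier.classify("free money prize"))  # spam
print(classifier.classify("meeting at noon"))   # not spam
```

With four training sentences this is obviously a toy; the point is only that “moving closer to or farther from a classification with each piece of information” amounts to adding up log probabilities word by word.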

What are some potentially fun beginner projects to do with natural language processing?

So I haven’t done any of these yet; up until last week I was still struggling just to get Python and NLTK running on Ubuntu and to download corpora. However, here are a few things that I think might be fun and not too difficult to make, some of which I’ve discussed before…

  • What author are you? Take a sample of your writing and compare it against books available from the Gutenberg corpus
  • Portmanteau-ifier – find a dictionary of root words and supply suggestions of good portmanteaus when given 2+ words
  • Spam vs. not spam email, mean vs. not mean comments
  • Rhyming poetry generation
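For the “What author are you?” idea, one possible starting point (a sketch under assumptions, not a tested design) is to compare word-frequency profiles with cosine similarity. The function names and the two placeholder “authors” below are hypothetical; a real version would load full texts, e.g. from the Gutenberg corpus, instead of one-line strings.

```python
import math
import re
from collections import Counter


def word_frequencies(text):
    """Word counts for a text, using a crude lowercase word splitter."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    shared = set(a) & set(b)
    dot = sum(a[word] * b[word] for word in shared)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_author(sample, author_texts):
    """Return the author whose text is most similar to the sample."""
    sample_freq = word_frequencies(sample)
    return max(
        author_texts,
        key=lambda name: cosine_similarity(
            sample_freq, word_frequencies(author_texts[name])
        ),
    )


# Placeholder texts, invented for this example.
authors = {
    "author_a": "the sea the ship the storm and the sailor",
    "author_b": "tea and cake in the garden with my aunt",
}
print(closest_author("a ship in a storm at sea", authors))  # author_a
```

Raw word counts are a blunt measure of style (common words like “the” dominate), so a fuller version would probably want to weight or filter words before comparing.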

Well, it’s been a long week here at DBC. We learned Rails over the weekend (not to mention I put up this blog – still proud of that!) and started a 5-day group project on Wednesday. My group is working on a roommate expense-sharing application modeled off my spreadsheet from back when I tracked expenses for the townhouse on the Upper West Side. It’s been wonderful working on something that could have real applications rather than toy projects – at some point I will write about how I feel about creating games while learning programming.

The roommate application fits under the umbrella of “things that could be good for the world” because I believe increased communal living among adults and nuclear families could be a wonderful thing for western society. Many people suffer from loneliness that partially results from not having a nuclear family or from being isolated to only their nuclear family on a daily basis, which could be alleviated if people belonged to larger, loosely affiliated groups that share spaces and responsibilities. My dream is to someday convince a bunch of my friends to take over a group of adjacent residences and raise children together as a group, cook and eat dinners as a group, etc. It could be highly efficient and beneficial to overall mental health.

I’ve been putting off giving a lightning talk on some sort of technical topic, which is required of us this week or next. I dislike the idea of looking into something purely to explain it to a group, so I’m sort of hoping that I naturally find the inclination to look into something this weekend. The things I’ve been thinking about throughout this program – natural language processing, statistics, image processing – perhaps one of those.

Other things on my mind – Chicago is a little warmer this week, had a beautiful moment of clarity in a sit spin attempt this morning (otherwise very wobbly on ice), and very hungry. Wondering how I’m perceived by my peers here, wondering whether we will keep in touch after the program.

Literally! I missed my stop on the #80 bus and found myself facing a wrought iron fence that blocked the alley that eventually opened into my street.

Also figuratively! I finally (finally!) set up an acceptable WordPress theme for this personal site and experienced a number of WordPress revelations heavily assisted by Ryan Bahniuk.

I meant to post a wordy update two months ago as I was moving to Chicago. I never wrote this update, so the bulleted Q&A version is below:

  • Why did you move?
    • I decided to quit my job in equity research and partake in DevBootcamp Chicago, a 9-week program that teaches people to be web developers.
  • Why did you decide to quit your job?
    • I think most of you I’ve talked to in person recently (i.e. within the last year) know that I genuinely liked my job and particularly liked the people there. And more of you probably know or could infer that I LOVE Excel spreadsheets and quantifying things in general. But ultimately I feel that large-cap equity investing lacks purpose on an individual level (even though it is highly meaningful at the global scale), and the pace of the work wasn’t something I felt keen to sustain.
  • Why this web developer thing?
    • I feel the need to clarify that I’m not a closet computer geek nor do I have ambitions to found the next overvalued tech IPO. I like that programming has become a cheap platform for normal people to create useful things, and I like working with logic. I also like learning new things, and paying $12,000 to quickly learn the basic skills to launch an entirely new career sounded like a remarkably efficient use of time and money.
  • How is Chicago vs. New York?
    • Well, I should caveat this statement since it’s only September and Chicago has had wonderful weather the last two months. But, so far, Chicago beats New York soundly. I like New York and the experiences it has to offer, but the experiences that I value most can be found in basically any major city, although here they cost less and are less crowded. The one exception is keeping up figure skating – the skating culture is much bigger here, and with regional competitions coming up, the rink is now regularly filled with young counterclockwise skaters who are much better than me.
  • How is the program going?
    • Not bad! It’s a decent amount of work, but ironically, I’m pretty sure I spend less time staring at a computer screen than when I was working. I think I had a bit of a head start compared to the average person who starts here due to the engineering major and my general math-iness, so it hasn’t been as stressful as I had been warned. The people here are wonderful (I would have expected nothing less, since they are mostly Midwestern), and I’ve been extremely impressed with how much work everyone is putting in and how much we’ve already learned in 6 weeks. The program has a strong and positive culture; I chose to come here partially because I wanted to attend yoga classes and learn about “engineering empathy” while also learning about programming, and I’m happy to report those expectations have been met and exceeded!
  • What will you do when you’re done in early October?
    • Not sure yet. My lease ends Jan 31, but I may still try to find a job in Austin or Portland (i.e. somewhere temperate) before winter. I plan to keep figure skating and to pick up learning piano/voice/other music stuff again. I plan to work on some programming-related projects, and probably visit the east coast sometime in November.

Well, that was much wordier than expected. I hope all of you east coast people (and anyone else) are doing well and haven’t forgotten about me!