Posts

  • The Research Tax

    Written quickly. Close to stream-of-consciousness.

    If you haven’t kept up with recent news at the intersection of academia and politics, here is the short version: the currently debated GOP tax bill significantly increases the tax burden on graduate students, and it just passed the House.

    This has been covered by Scott Aaronson, Luca Trevisan, and others. What follows is a summary of their posts. Skip to the next section if you’d just like my opinion.

    Currently, universities handle PhD student tuition like this.

    • The graduate student pays \(\$X\) as tuition.
    • The university waives \(\$X\) of the tuition.
    • The university then pays a graduate student stipend of \(\$S\).
    • In the current system, stipend \(\$S\) is taxed and the waived tuition \(\$X\) is not. The student only ever receives \(\$S\) - the \(\$X\) is essentially invisible.

    Under the new GOP tax bill, the waived tuition \(\$X\) will be taxed. This is a double whammy: not only does it increase the student’s total taxable income, the increase is large enough to push some students into the next federal tax bracket. A more detailed breakdown can be found here. The linked analysis shows that if nothing else changes, a typical in-state Berkeley PhD student would pay about $1400 more tax, and a typical MIT PhD student would pay about $9500 more tax. These are rough orders of magnitude for how it affects public universities vs private universities, with more damage at private universities because they charge higher tuition. Importantly, students would not be taxed more because of a larger stipend - they would be taxed on money that never entered their pockets in the first place.
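
    To make the arithmetic concrete, here is a minimal sketch. Everything numeric in it is an assumption: the bracket thresholds, rates, stipend, and tuition are made-up round numbers, not the real federal schedule or any university’s figures, and `tax_owed` is a hypothetical helper. It only shows the mechanism, where taxable income goes from \(\$S\) to \(\$S + \$X\) while take-home pay stays at \(\$S\).

```python
# Hypothetical progressive schedule: (lower bound of bracket, marginal rate).
# These numbers are illustrative only, NOT the actual federal brackets.
BRACKETS = [(0, 0.10), (10_000, 0.15), (40_000, 0.25)]


def tax_owed(taxable_income, brackets=BRACKETS):
    """Tax under a marginal-rate schedule: each slice of income is taxed at its own rate."""
    total = 0.0
    for i, (lower, rate) in enumerate(brackets):
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        if taxable_income > lower:
            total += (min(taxable_income, upper) - lower) * rate
    return total


stipend = 30_000   # S: money the student actually receives (made-up)
tuition = 20_000   # X: waived tuition the student never sees (made-up)

tax_now = tax_owed(stipend)                 # only S is taxable today
tax_proposed = tax_owed(stipend + tuition)  # S + X is taxable under the bill
extra_tax = tax_proposed - tax_now
```

    With these made-up numbers, part of the waived tuition lands in the top hypothetical bracket, so `extra_tax` is larger than it would be if all of \(\$X\) were taxed at the student’s old marginal rate.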

    As for why universities can’t simply declare that grad student tuitions are \(\$0\) - there’s some accounting trick that lets the university get more money if they give tuition waivers to grad students. I haven’t looked into the details of this.

    * * *

    I’m currently not in academia, for several reasons, but the big one is that I got a job offer from an industry research lab with interests close to mine. I’m certainly giving up some things, but the trade-offs fall in favor of me staying out of academia.

    If I had gone to academia, I would have been okay financially, thanks to several lucky breaks. I was born in an upper-middle class family, the kind that doesn’t spend a lot of money but has money to burn. I had a natural interest in math and computer science, and it turns out the world’s willing to pay those people quite a bit if they enter finance or software. I liked algorithms, which happened to be the weird test of software engineering prowess in Silicon Valley - the only reason I got my first internship was because I knew the pseudocode for Dijkstra’s algorithm. And although part of my heart will always belong to the beauty of proofs, I tolerated systems enough to pick up the skills that let me handle industry.

    Overall, I’ve lived a privileged life. That likely wouldn’t change in academia because CS PhDs have it easier than other departments. With careful spending, I think I could intern at a tech company some months of the year, use the money from that to fund research for the rest of the year, and still end net positive.

    The thing is, these policies aren’t crippling to people like me. They’re crippling to the less fortunate.

    I can’t speak for other fields, but academia for CS is increasingly a rich person’s game. Any strong PhD candidate could be at least an average software engineer, and that’s a lot of money to leave on the table. I’ve read an anecdotal story of a promising researcher, the first in her family to go to college, who laughed at the thought of going for a PhD. Her parents had done so much to support her. It would have been too selfish to turn down a well-paying job that could let her start paying them back.

    Across all fields, the tax bill would essentially do the same: make academia more of a rich person’s game. The reason the news bothers me so much is that if it goes through, there’s going to be so much unfulfilled passion, so many students who can’t let their research interests override financial realities. It’s a duller, less colorful world.

    * * *

    To play devil’s advocate, the analysis above assumes nothing else about the world will change. This is very unrealistic. If the tax bill goes through as is, universities will certainly adjust - ask for more donations, decrease tuition, and make up the accounting shortfall elsewhere with even more creative cost-cutting. The actual tax increase would likely be lower than the current numbers.

    However, I have a hard time believing that universities will be able to make up all the difference. Universities certainly have bloat, and a reduced budget provides a very strong motivation to identify that bloat - but based on what I’ve heard about university financials, I’m not convinced there’s a lot that can be cut without a fight. There are some damning numbers showing that administrators are taking up an increasingly large share of university budgets, but I’d guess that you can’t just lay off a ton of admins and expect the university to put itself back together in a reasonable timeframe.

    The top-tier universities can weather this better. The lower-tier universities, less so. It’s the same rich person’s game - universities that already have trouble recruiting grad students will have even more trouble. The conclusion is similarly disappointing.

    * * *

    Throughout this post, I’ve been assuming academia is intrinsically valuable. That’s certainly up for debate. One argument I’ve seen is that outside of the top-tier universities, academia is a net-negative pursuit, and it would be better for society if lower-tier schools were priced out of relevance. Given the latent misery and stress of academia, and the constant self-doubt researchers have about the relevance of their own work, I think it’s worth considering this argument seriously. However, debating the merits of academia is out of the scope of this post.

    To funnel everything back into RL terms (since I’m “that RL guy”) - I see academia as the ultimate extreme of the exploration-exploitation tradeoff. Industry is content to do what works, industry research labs can be more exploratory, and academia gets to consider crazy ideas that may not be relevant for decades. In my ideal society, there are always people taking crazy ideas seriously. And I mean that in a good way! Some nuts talking about water memory, some other people trying to quantify the odds we’re living in a simulation, a third group advocating that we spend the next 50 years building a model of all of ethics. The strength of academia (and the argument for tenure) is that it lets you do these things if you care about them enough.

    Somewhere out in the world is a cohort of Medieval Studies PhDs, and I feel very safe saying that little of note will come from there in the next 25 years. But that doesn’t mean I want them to disappear. Do you know how insane you have to be to want to do Medieval Studies? Like, holy shit, you really really really really have to like the subject to want to spend your life doing that. How is that not crazy awesome?

    The world should have room for people like that. I’m worried it won’t.

  • My First First-Author Paper

    Images of a simulated robot, real robot, and simulated images made to look realistic

    Today, the paper Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping went public on arXiv. I’ve been working on this project for the past 6 months, and this is the first time I’ve been first author on a research paper, so overall it’s a Pretty Big Deal for me. I share first authorship with Konstantinos Bousmalis and Paul Wohlhart, both of whom were great to work with. The link above will take you to a landing page with the arXiv link and a video that briefly explains our work.

    If it wasn’t clear from the lengthy author list, a lot of people contributed to this project. Thanks to everyone for the mentorship and the engineering support that made this project possible.

  • The Friendship Paradox And You

    The friendship paradox is a cute rule of thumb. Unlike other rules of thumb, it actually has some mathematical justification behind it.

    The paradox states that on average, your friends have more friends than you do. At first glance, this may seem strange, because it can’t be true for everybody. Someone has to be more popular than everybody else. And that’s true - somebody has to be on top. That’s why the statement says on average. A small fraction of people are more popular than their friends, and a large fraction are less popular than their friends.

    To justify why this could be true, let’s model friendship as an undirected graph. In this graph, people are vertices, and an edge connects two people if they’re friends with one another.

    Friendship Graph

    Now let’s introduce some notation. \(V\) is the set of all vertices, \(n\) is the number of vertices, \(v\) is a single vertex, and \(d_v\) is the degree of vertex \(v\). The average number of friends a person has is

    \[\text{Average number of friends} = \frac{\sum_{v \in V} d_v}{n}\]

    Alright. But what’s the average number of friends that someone’s friends have? To count this, let’s imagine that every person \(v\) creates \(d_v\) lists, one for each of their friends. Every list is titled with that friend’s name. Say their friend is named \(u\), for sake of example. On that list, \(v\) writes down all of \(u\)’s friends.

    Friendship Graph With Lists

    The average number of friends that \(v\)’s friends have is the average length of the lists that \(v\) created, not counting the title. The average number of friends that someone’s friends have is the average length of all these lists.

    Each \(v\) creates \(d_v\) lists, giving a total of \(\sum_{v \in V} d_v\) lists. There are \(d_v\) lists titled \(v\), because we get one such list whenever a friend of \(v\) creates lists. Each of those lists has \(d_v\) names on it. Thus, each person \(v\) contributes \(d_v^2\) to the total length.

    Overall, this gives

    \[\text{Average number of friends of friends} = \frac{\sum_{v \in V} d_v^2}{\sum_{v \in V} d_v}\]
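
    As a sanity check on the list-counting argument, here is a minimal sketch on a made-up four-person graph (the names and edges are assumptions for illustration). It builds the lists literally and compares their average length against the closed form above.

```python
# A small made-up friendship graph, stored as an undirected adjacency list.
friends = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B"],
    "D": ["A"],
}

degree = {v: len(nbrs) for v, nbrs in friends.items()}
n = len(friends)

# Average number of friends: sum of degrees divided by number of people.
avg_friends = sum(degree.values()) / n

# Build the lists literally: person v makes one list per friend u,
# and the length of that list is the number of friends u has.
list_lengths = [degree[u] for v in friends for u in friends[v]]
avg_friends_of_friends = sum(list_lengths) / len(list_lengths)

# The closed form from the text: sum of squared degrees over sum of degrees.
closed_form = sum(d * d for d in degree.values()) / sum(degree.values())
```

    On this graph the average number of friends is 2, while the average list length is 2.25, and the literal count agrees with the closed form.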

    Now, apply the Cauchy-Schwarz inequality. This inequality states that for any two vectors \(a\) and \(b\), their dot product is at most the product of their norms. We’ll use the version where we square both sides.

    \[\langle a, b \rangle^2 \le \|a\|^2\|b\|^2\]

    Let \(a\) be the vector of all ones, and \(b\) be the vector of degrees \(d_v\). Since there are \(n\) vertices, we get

    \[\left(\sum_{v \in V} d_v\right)^2 \le n \sum_{v \in V} d_v^2\]

    which rearranges to

    \[\frac{\sum_{v \in V} d_v}{n} \le \frac{\sum_{v \in V} d_v^2}{\sum_{v \in V} d_v}\]

    The left hand side is the average number of friends, and the right hand side is the average number of friends of friends. That concludes the proof. \(\blacksquare\)
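
    One way to see how big the gap is uses a standard identity that is not part of the argument above. Write \(\mu\) for the average degree and \(\sigma^2\) for the (population) variance of the degrees, and assume \(\mu > 0\). Since \(\frac{1}{n}\sum_{v \in V} d_v^2 = \mu^2 + \sigma^2\), we get

    \[\frac{\sum_{v \in V} d_v^2}{\sum_{v \in V} d_v} = \frac{n(\mu^2 + \sigma^2)}{n\mu} = \mu + \frac{\sigma^2}{\mu}\]

    So the two averages are equal only when everybody has the same number of friends, and the gap grows with how unevenly friends are distributed.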

    At a high level, the friendship paradox happens because the popularity of popular people spreads through the network - they have lots of friends, each of whom sees that some of their friends are popular.

    Importantly, this only says something about the average. Arguing anything more requires making assumptions about how people interact and how friendships work.
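
    To illustrate how much the per-person picture depends on the graph, here is a minimal sketch on a made-up star graph (the graph and the helper `mean_friend_degree` are assumptions for illustration). It compares each person against the average popularity of their own friends. By contrast, in a graph where everyone has the same degree, nobody is strictly less popular than their friends.

```python
# Made-up star graph: one hub is friends with five leaves, leaves only know the hub.
leaves = [f"leaf{i}" for i in range(5)]
friends = {"hub": list(leaves)}
friends.update({leaf: ["hub"] for leaf in leaves})

degree = {v: len(nbrs) for v, nbrs in friends.items()}


def mean_friend_degree(v):
    """Average number of friends among v's own friends."""
    return sum(degree[u] for u in friends[v]) / degree[v]


# People whose friends are, on average, more popular than they are.
less_popular = [v for v in friends if degree[v] < mean_friend_degree(v)]
# People who are more popular than their friends' average.
more_popular = [v for v in friends if degree[v] > mean_friend_degree(v)]
```

    Here the hub is the only person more popular than their friends, and all five leaves are less popular than theirs.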

    A natural extension is to ask whether a similar result holds in directed graphs. A lot of relationships aren’t symmetric, so if a similar result holds, it makes the principle more applicable.

    It turns out such a result does exist. Let’s model a directed edge as a producer-consumer relationship. There’s an edge from \(u\) to \(v\) if \(u\) produces something that \(v\) consumes.

    Directed Graph

    People both produce things and consume things, represented by out-edges and in-edges respectively. Let \(d_{v,out}\) and \(d_{v,in}\) be the number of out-edges and in-edges for \(v\).

    Let’s consider the average number of outgoing edges. This is the average number of things that people produce.

    \[\frac{\sum_{v \in V} d_{v,out}}{n}\]

    For a given \(v\), let’s compare this to the number of things produced by content producers that \(v\) follows. For each edge coming into \(v\), create a list for the source of that edge, writing down every consumer of that source. (The names on this list are the endpoints of the outgoing edges from that source.)

    Directed Graph List

    From here we can apply similar logic. Each person creates one list for each incoming edge, and every edge is incoming to exactly one person, so there are \(\sum_v d_{v,in} = \sum_v d_{v,out}\) lists total. Each \(v\) is the title of \(d_{v,out}\) lists, one for each consumer. Each of those lists will have \(d_{v,out}\) items on it. All together, the average list length is

    \[\frac{\sum_{v \in V} d_{v,out}^2}{\sum_{v \in V} d_{v,out}}\]

    which we can once again apply Cauchy-Schwarz to. The conclusion?

    On average, the content producers you follow make more things than you do.

    I call this the producer view, because you’re always counting the edges that leave each vertex. We can also take the consumer view, counting the edges that are entering each vertex instead. By performing a similar argument, you get this conclusion instead.

    On average, the people who follow your work follow more things than you do.

    Both views are valid, and give different interpretations of the same graph.
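
    Both views can be checked numerically. This is a minimal sketch on a made-up directed graph (the edge list is an assumption for illustration), where an edge `(u, v)` means `u` produces something `v` consumes.

```python
# Made-up directed graph: an edge (u, v) means u produces something v consumes.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("C", "A")]
people = sorted({p for edge in edges for p in edge})
n = len(people)

out_deg = {p: 0 for p in people}
in_deg = {p: 0 for p in people}
for u, v in edges:
    out_deg[u] += 1
    in_deg[v] += 1

# Producer view: each edge (u, v) gives consumer v one list titled u,
# whose length is the number of consumers u has.
avg_out = sum(out_deg.values()) / n
avg_out_of_followed = sum(out_deg[u] for u, v in edges) / len(edges)

# Consumer view: each edge (u, v) gives producer u one list titled v,
# whose length is the number of producers v follows.
avg_in = sum(in_deg.values()) / n
avg_in_of_followers = sum(in_deg[v] for u, v in edges) / len(edges)
```

    In both views the per-list average comes out at least as large as the plain average, which is the same for outgoing and incoming edges because each edge is counted once either way.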

    Again, this argument only says something about the average, and you need assumptions about graph connectivity to argue anything stronger. In fact, despite its mathematical underpinnings, I would hesitate to treat the friendship paradox as a truth about the world. I see it more as a principle that’s useful for flavoring different arguments, but not strong enough to hold up an argument on its own.

    * * *

    In the derivation above, the only requirement was that we could model interactions as a graph.

    There’s a branch of mathematics called category theory. I don’t know it very well, but the impression I get is that you let objects represent something, you let arrows represent some relation between objects, you draw arrows between different objects, and then you interpret all of mathematics as special cases of those objects and arrows. This lets you do things like explain finance by drawing a bunch of arrows.

    For some reason I know \(\epsilon > 0\) fans of category theory are going to read this post, so as an homage to them, let’s make a bunch of wild claims about society by generating different interpretations of vertices and edges.

    Let vertices be Twitter accounts. An edge goes from \(u\) to \(v\) if \(v\) follows \(u\), so \(u\) is the producer and \(v\) is the consumer. In the producer view, on average the accounts you follow have more followers than you. In the consumer view, on average the people who follow you follow more people than you do.

    Let vertices be people. Instead of friendship, say there’s an edge from \(u\) to \(v\) if \(u\) has a crush on \(v\). To the disappointment of many people, crushes aren’t symmetric. In the consumer view (counting incoming edges), on average the people you crush on have more admirers than you do. In the producer view (counting outgoing edges), on average people who have crushes on you have crushes on more people than you do. I don’t know if this makes anyone feel better about their love life, but there you go?

    Again, let vertices be people. This time, there is an edge from \(u\) to \(v\) if \(u\) writes something that \(v\) reads. In the producer view, on average your readership is smaller than the readerships of writers you follow. In the consumer view, on average your readership reads more things than you do. Now, not everybody writes, but we could substitute writing with any form of communication. Blogs, articles, Facebook posts, speeches, YouTube videos, research papers, memes…

    In general, any prolific person not only makes lots of things, they become well-known for making lots of things. Their reputation both precedes them and outgrows them.

    * * *

    I like the friendship paradox a lot. Why?

    Well, for one, it’s great for addressing imposter syndrome issues. For example, sometimes I feel like I should be writing more. When I poke at the feeling, it often turns into this.

    1. I should write more.
    2. Why do I think that? It’s partly because I read cool things from people that write more than I do.
    3. But by the friendship paradox, it’s expected that those writers are writing more than me.
    4. So hey, maybe I shouldn’t feel too bad.

    More importantly, the friendship paradox touches on a bigger idea: the things you see don’t have to reflect reality. If you base your assumptions of popularity on the popularity of your friends, on average you’ll come up short. If you base your assumptions of productivity on things you read online, you’ll mostly see evidence of productivity from people who are very productive. If you base your views of somebody on things you hear them say, those views are warped by the chances you would have heard them in the first place. And so on down.

    Not all of these are applications of the friendship paradox, but it’s easy to forget about these things, and thinking about the paradox is a nice reminder.
