September 2023

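"Work hard all your life, then die rich." Let's not discuss whether anyone can actually work a whole lifetime; don't ask a mere intern like me such a serious question. Whether I'll be rich, I know even less. All I know is that "一麻" really did make me question life a little. Fine, I'll drop the act: I'm actually a xiaoliu (a young international student). Worst case, I live off my parents. I got born into this family on my own merit, so that seems fine.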

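Inevitably, this reminds me of the witcher's "neither of the two" in The Lesser Evil. Unless you have faced a similar situation, it is really hard to understand how difficult Geralt's choice was; the temptation to do evil is too great. So for now, I choose the lesser evil.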

'Evil is evil, Stregobor,' said the witcher seriously as he got up. 'Lesser, greater, middling, it's all the same. Proportions are negotiated, boundaries blurred. I'm not a pious hermit, I haven't done only good in my life. But if I'm to choose between one evil and another, then I prefer not to choose at all. Time for me to go.'

I can't fix Amazon, but I can still boot up the game, walk into Arasaka Tower, and hack Adam Smasher to pieces with a QianT Sandevistan Mk.4.

Recollection

Nikon Z fc - NIKKOR Z DX 18-140mm f/3.5-6.3 VR | RICOH GR III
hsbc
Disparage the xiaoliu, understand the xiaoliu, turn back into a xiaoliu
The modern bartender (my hand shook...)
Moonlander - she's awful at first, but you feel you'll come to like her
Breka Bakery & Café - Doughnut
Breka Bakery & Café - Cookies

W-8BEN

If you can fill out the W-8BEN this way, do. Remember to come back and thank me in 10 years.

CAPM, Alpha, Beta

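I skimmed CAPM and the so-called alpha and beta. Fama–French is at least reasonable, but what on earth is the Barra Risk Factor? At "10个米" a year (slang for its price tag), why don't you just go rob people? The control theory in my head makes me instinctively suspicious of "finding alpha" itself, so you all have fun; I'm all in on beta.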
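For reference, here are the standard textbook definitions behind that paragraph, as a sketch rather than anything I derived. The symbols $R_i$, $R_m$, and $R_f$ (asset return, market return, risk-free rate) are my own notation for this note:

```latex
\begin{align}
\mathbb{E}[R_i] &= R_f + \beta_i \left( \mathbb{E}[R_m] - R_f \right) \\
\beta_i &= \frac{\operatorname{Cov}(R_i, R_m)}{\operatorname{Var}(R_m)} \\
\alpha_i &= R_i - \left[ R_f + \beta_i \left( R_m - R_f \right) \right]
\end{align}
```

Read this way, going "all in on beta" just means holding the market itself, where $\beta = 1$ and $\alpha = 0$ by construction.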

But what is "finding alpha" even for? What I would rather know is what kind of life I actually want, within my limits.

Discipline Is Actually An Emotion

https://www.youtube.com/watch?v=0N0LV0mqTYQ

I couldn't find any experiments backing it up; it relies on stories, so in essence it is a 'trust me bro' 'From a doctor licensed in the US'.

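From personal experience, though, the explanation does hold up: the reason for a lack of discipline is "doubt" about the thing you have to do, and the opposite emotion of doubt is "resolve". So if you want to become disciplined, the first thing to do is not to chase discipline but to be full of resolve.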

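I still have to remind everyone: having resolve doesn't mean being a dumbass. Read too much Shonen Jump, have you? Having resolve without being a dumbass is hard. Being a dumbass isn't discipline; it's greed, anger, and delusion. That is exactly why discipline is so precious.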

Simone Weil

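I believe you have a character that condemns you to suffer a great deal all your life. I am even sure of it. You have too much ardor and too much impetuosity ever to adapt to the social life of our time. You are not alone in this. But suffering does not matter, all the more since you will also experience intense joys. What matters is not to waste one's life. And for that, one must discipline oneself.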

Je crois que vous avez un caractère qui vous condamne à souffrir beaucoup toute votre vie. J'en suis même sûre. Vous avez trop d'ardeur et trop d'impétuosité pour pouvoir jamais vous adapter à la vie sociale de notre époque. Vous n'êtes pas seule ainsi. Mais souffrir, cela n'a pas d'importance, d'autant que vous éprouverez aussi de vives joies. Ce qui importe, c'est de ne pas rater sa vie. Or, pour ça, il faut se discipliner.

"Wow, that's totally me."

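Right, you are the most passionate, you suffer the most, and nobody can understand your deep, surging inner world. How are you this narcissistic? Everything before that can be deleted, and the suffering can be deleted too. These sentences, these words, come across as overwrought because most people today lack shared experience and real connection in their lives, and they carry an intolerable stupidity and sin.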

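The ego of the original text is hopelessly closed in on itself, and the numbness of the commentary cannot sustain a human life either. Still, both experiences are worth practicing, if only so they can be abandoned afterward. Then we return to this world with its doors wide open and think about how not to insult life's infinitely rich possibilities.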

The Scent of Rain

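Born in the early '80s. I lived through the entire development of ACG culture. Do you think I feel fulfilled because I have plenty of gaming experience? Quite the opposite.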

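Because of all the games I came across in these 40 years, ninety percent are never mentioned by anyone anymore. The classic-masterpiece retrospectives people now see on Bilibili, the games where people routinely leave thousand-word comments, really are a single hair off nine hundred oxen.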

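We humans are rather pitiful, when you think about it: if you are the only one who knows about something, its meaning to you becomes ambiguous. Only when everyone is discussing, sharing, and fermenting a thing does it truly feel real to us. Put concretely, that is "truth, passion, love". Society and life must always be in the "progressive tense", something-something-ing, and that matters a great deal to us.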

https://www.zhihu.com/question/596914550/answer/3061515018

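The first time I saw the root xeno-, I felt it strike my soul, or rather, that it was part of my soul. I had always thought I never belonged to any group, that I was a so-called stranger, an "outsider".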

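Unfortunately, though, I witnessed the cyberspace of the good old days, when people used profile pictures as avatars and built city after city out of words and drawings on forums and Tieba. Perhaps I should quote lines from my favorite chuunibyou novel of that period to describe this epic impression: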

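After that, a new kingdom was founded. Its people included humans and mountain folk, as well as elves and dwarves; the teachings of the Temple spread widely there, yet faith in nature was not rejected either.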

This was a land of freedom, where the sounds of coin and sword echoed each other all year round.

I believe everyone who has been inside is marked by it, even though it is now lost, like tears in rain.

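I have also shared my belated romantic experience with several people. Perhaps it was the way I told it, but I think it was more that they had never entered these cities: old or young, they all wanted to say, "The person I fell in love with was someone I constructed, a projection of mine." Even she said the second half of that sentence.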

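But my rejection of the "construction" really did come before she spoke of projection. I had long felt that I was someone who had lost the ability to love. This kind of personal choice, which in the ideal case is not rigidly bound by blood, community, or economic ties [1], seemed far too foolish, and my earlier self simply could not understand it. And yet that blaze of burning anxiety to understand the other person even while bathed in ice water, that longing every night to see the face and body that had only just left: what is that if not love?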

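Could I be lying even to myself? Could I really have been constructing someone else? What if I say that whenever I guessed her thoughts right or wrong, on purpose or not, it was only so that she might reveal a little more of what was inside her, and that I was prepared for either acceptance or rejection? Could I not have ended it myself in silence, rather than begging her to grant me release?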

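So, looking back now: no. I wasn't searching for a person; I fell for a living, breathing person, who just happened to carry, faintly, the felt reality of that era. It was the scent of rain, and I only understand that now. I wanted her to pour her heart out, to express herself, to share with confidence the poems or paintings from these invisible cities that she had encountered and that I had probably encountered too. As if we had met in an eternal winter night and stopped; I built a campfire, and she played the guitar.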

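He once believed that these soul-stirring passions were an inseparable part of his freedom as a human being; now, he feels free precisely because he no longer has them.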

Continuing my little essay to Yvonne

Hello Dr. Yvonne,

I have decided not to enroll in the Master Thesis course this fall semester. It's not a decision I've made willingly, but rather a necessary compromise: if I were a Canadian citizen or permanent resident, I would certainly take it. However, if I take the course and fail to publish the paper, I will lose my PGWP. This is a substantial risk that I believe no international student would willingly take. Again, I am stubborn but not stupid. Getting a paper published this semester is mission impossible, even with help from the WPI student I introduced to you earlier.

However, my desire to continue to be your disciple remains strong. I still want to work on the thesis and produce results (in my dreams, at OSDI or PODC) that demonstrate my potential as a strong Ph.D. candidate in the field of Distributed Systems. I understand this may make me look like an impostor: if my passion for the field is genuine, why should the outcome matter? But I won't fall for this "you are not a real fan" accusation -- I refuse to be drawn into a debate about authenticity. I am genuinely passionate about this field, and I want both: a deep understanding and tangible results.

Therefore, I've decided to change the game plan for my graduation thesis (possibly an after-graduation thesis). I plan to work on this paper outside of an official course, yet I still wish for you to be my advisor, albeit unofficially. My only request is that I would like to schedule one-on-one meetings with you to discuss my ideas and approach to the thesis. I need a community. I don't want to fight alone. I assure you that I will have a prepared agenda for each meeting to respect and make the most of your valuable time.

After a recent literature review, I've found that I'm not particularly interested in HPC. To me, it's essentially a distributed system with heterogeneous hardware, and the truly interesting part lies in the scheduler, which I don't think I can get permission to experiment with on the Discovery cluster. I will still maintain my focus on Distributed Systems, particularly in the context of Machine Learning tasks. A rough idea in my head is a comparative study of the performance of the same Machine Learning task executed on Ray and Spark, with a focus on dataflow.

Why Ray, and what is it? Let me continue my AWS log dive story with you. It made my internship the second most fascinating thing that has happened to me since I came to Canada (I already shared with you the most fascinating one, with love and tears).

So for the 12-petabyte (not 16; at the time we got the number wrong) log dive, I was required to come up with a design doc within a single day. And I made it, using the MapReduce paradigm -- yes, the very one we discussed in our first class, where we officially met for the first time. I also asked for Peter's opinion on using Step Functions as the coordinator (master) for the whole process. Unfortunately, he didn't have time to review my design doc. The core concept was to have an Extract, Transform, Load (ETL) job, powered by a Spark Executor with 128 GB RAM and 32 vCPUs, handle one day's worth of data (roughly 3 TB). This would represent one map task. And scanning the 12 petabytes of data, according to my experiments, would cost Amazon ~8 million dollars.

To be honest, when I started the design, I intuitively felt that a Spark Executor shouldn't be the map worker for such a task -- it's just a grep task. The data were stored as JSON (that's why the S3 bucket took 12 petabytes and cost Amazon 300k per month), so there's no "dataframe"; all we need is a string match, the KMP kind. But the bottleneck for this task is S3's I/O, and it seemed there was no tool other than Spark that could run the grep efficiently enough to make full use of S3's throughput limit. They wanted results badly, so speed really mattered here. There was no time for me to write code from scratch to control a bunch of EC2 instances, even using Elastic MapReduce.

Luckily, one day I got some help from a nice guy, Zach. He is a Senior Big Data Architect and, on hearing my task, he insistently recommended I use Ray, which had just been released in AWS Glue (the product hub for serverless data analysis and computation; Spark (Glue ETL) is also there).

"How could I have overlooked Ray?" I asked myself during the meeting. After all, it's the framework behind Alpa. What a serendipitous connection! Ray is another framework that handles intensive ML dataflow, especially Reinforcement Learning, and it came out of UC Berkeley's RISELab, the successor to the AMPLab that created Spark. Ion Stoica is the Oppenheimer behind the two projects. In my real-world experience with Ray, it makes programming for distributed systems almost as easy as parallel programming. All the nuances of consistency, availability, and partition tolerance are hidden from users. They only need to write Python code. (I'm sorry it's not your favorite, Rust.)

Zach also wrote me a demo for such a log dive. After I modified it according to our needs and did some fine-tuning, the cost for a scan was estimated at merely 500 dollars.

So from 8 million dollars to 500 dollars, that's a BIG win. But, just like my love story, reality is not a fairytale. Amazon ultimately decided to use an existing tool to do the scan, partly because Ray is not available in some regions, but mostly because they wanted the result right away and could not take the risk.

As for me, my internship was extended until December, and I was assigned another project. Despite having the design doc and a Ray prototype ready, I have had to put them on hold, most likely for good.

Very sincerely yours,

Liao-Liao
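The "one map task per day of logs" idea from the letter can be sketched roughly as follows. This is only an illustration under loud assumptions: it uses Python's standard-library thread pool rather than Spark or Ray, in-memory strings rather than S3 objects, and Python's built-in substring search rather than KMP. The names `grep_day`, `log_dive`, and `SAMPLE_DAYS` are made up for this sketch and are not from the actual design doc or Zach's demo.

```python
from concurrent.futures import ThreadPoolExecutor


def grep_day(lines, needle):
    """Map task: return the raw lines of one day's logs that contain needle."""
    # Plain substring search; the JSON records are never parsed.
    return [line for line in lines if needle in line]


def log_dive(days, needle, max_workers=4):
    """Run one map task per day, then collect the hits per day (reduce)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {day: pool.submit(grep_day, lines, needle)
                   for day, lines in days.items()}
        return {day: future.result() for day, future in futures.items()}


# Hypothetical sample data standing in for one day of log objects each.
SAMPLE_DAYS = {
    "2023-09-01": ['{"msg": "ok"}', '{"msg": "timeout talking to db"}'],
    "2023-09-02": ['{"msg": "timeout again"}', '{"msg": "ok"}'],
    "2023-09-03": ['{"msg": "ok"}'],
}

hits = log_dive(SAMPLE_DAYS, "timeout")
total = sum(len(day_hits) for day_hits in hits.values())
print(total, "matching lines")
```

The point of the shape is that each day is independent, so the work is limited by how fast the storage can be read, not by any coordination between workers.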

Life's all about timing. You gotta know when to pull the trigger, when to go in for the kiss, and—most of all—when to make for the exit.

Footnotes

  1. https://www.zhihu.com/question/495734671/answer/3215572349