Calling tidewater: the model you asked for is here!

热血热胜红日光
Original poster (北美华人网)
Machine learning uncovers the most robust self-report predictors of relationship quality across 43 longitudinal couples studies
https://www.pnas.org/content/early/2020/07/21/1917036117

What predicts how happy people are with their romantic relationships? Relationship science—an interdisciplinary field spanning psychology, sociology, economics, family studies, and communication—has identified hundreds of variables that purportedly shape romantic relationship quality. The current project used machine learning to directly quantify and compare the predictive power of many such variables among 11,196 romantic couples. People’s own judgments about the relationship itself—such as how satisfied and committed they perceived their partners to be, and how appreciative they felt toward their partners—explained approximately 45% of their current satisfaction. The partner’s judgments did not add information, nor did either person’s personalities or traits. Furthermore, none of these variables could predict whose relationship quality would increase versus decrease over time.
Given the powerful implications of relationship quality for health and well-being, a central mission of relationship science is explaining why some romantic relationships thrive more than others. This large-scale project used machine learning (i.e., Random Forests) to 1) quantify the extent to which relationship quality is predictable and 2) identify which constructs reliably predict relationship quality. Across 43 dyadic longitudinal datasets from 29 laboratories, the top relationship-specific predictors of relationship quality were perceived-partner commitment, appreciation, sexual satisfaction, perceived-partner satisfaction, and conflict. The top individual-difference predictors were life satisfaction, negative affect, depression, attachment avoidance, and attachment anxiety. Overall, relationship-specific variables predicted up to 45% of variance at baseline, and up to 18% of variance at the end of each study. Individual differences also performed well (21% and 12%, respectively). Actor-reported variables (i.e., own relationship-specific and individual-difference variables) predicted two to four times more variance than partner-reported variables (i.e., the partner’s ratings on those variables). Importantly, individual differences and partner reports had no predictive effects beyond actor-reported relationship-specific variables alone. These findings imply that the sum of all individual differences and partner experiences exert their influence on relationship quality via a person’s own relationship-specific experiences, and effects due to moderation by individual differences and moderation by partner-reports may be quite small. Finally, relationship-quality change (i.e., increases or decreases in relationship quality over the course of a study) was largely unpredictable from any combination of self-report variables. This collective effort should guide future models of relationships.
热血热胜红日光
The paper's lead author is doing an AMA on Reddit right now ("I am a ..., ask me anything you want").
If you have questions, go ask now! https://www.reddit.com/r/IAmA/comments/i5dtml/im_dr_samantha_joel_my_team_and_i_use_ai_to/
热血热胜红日光
Some of the better Q&As from the AMA:
Q:What would you say is the biggest takeaway for a couple based on the results of your study? And is there anything a single person should take from it while looking for a partner? If I'm understanding it correctly, it looks like a lot of the factors that lead to success are things it might not be easy to evaluate until you've actually been in a relationship with someone for a bit.
A: I think the biggest takeaway, to paraphrase my old friend and colleague Geoff MacDonald, is that the person you choose may not be as important as the relationship you build. As a culture, we put so much emphasis on choosing the right person. These results suggest that it’s really more important to be the right person. To create the conditions that will allow a relationship to flourish. In terms of your point about evaluations, this is something I’ve spent a lot of time thinking about myself. Can a relationship be objectively evaluated—are some partnerships inherently better than others--and if so, when do these objective criteria first come online? This is somewhere my students and I would really like to take our research next. We want to recruit couples in brand new relationships and study how they evaluate each other for compatibility and fit, and how those evaluations change as the relationship develops. We were supposed to launch the study in March, but it got stalled due to COVID. Hopefully soon we’ll be able to open the lab up again, and I’ll have some more concrete answers for you.
热血热胜红日光
Replying to 热血热胜红日光's post (#3)
Q: Why do you think that it's so difficult to predict which relationships will work out well, and which won't? (whether using AI or not)
A:That’s a great question. I think when it comes to relationship quality and longevity, there are a lot of chaotic processes at work that make long-term prediction difficult. Stressors and life events that come up, idiosyncratic experiences that you might happen to have with your partner, other people who may enter or exit your life and who give you different perspectives and ways of thinking about the partnership, etc. So we can predict the aspects of the relationship that are stable, but they also change over time in unpredictable ways. I think that’s because the changes are largely driven by these kinds of environmental and contextual factors that are very difficult to measure, let alone predict.
热血热胜红日光
Q: Have you found that the partnerships need to have a similar understanding of what the commitment translates to? For example, putting equal effort into maintaining the home, or equal involvement with children. Do any of the studies collect information to confirm or deny the reliability of zodiac sign (eastern and western) compatibility? For participants who had a “type” they were attracted to while dating, did their significant other match that description?
A: This is one of the more interesting aspects of the findings, IMO – we did not find any evidence for any kind of partner matching predicting relationship quality. The algorithm we were using detects interactions. So if my traits and preferences match with your traits and preferences to predict relationship quality, we should have picked up on that. For example, if Andrea says she likes extraverted guys, and she’s happy with Tom because he’s an extraverted guy, we should have found that putting Andrea’s desired extraversion and Tom’s own extraversion into the same model would have predicted more variance than either on its own. But that’s not what happens. Combining both partner’s variables didn’t predict more variance than just one partner’s variables. So that goes against the idea of matching, similarity, having a type, etc. If there was any matching going on, it didn’t predict how happy people were with their partners.
tidewater
Nice 👍
shanshuipinglan
So it all comes down to playing the nurturing game...
热血热胜红日光
Important evidence in support of tidewater's view:
Q: Very thought provoking. Have you been able to find evidence that predicts the relationship quality? And thank you for doing this AMA!
A: Relationship-specific variables did a great job of predicting relationship quality. Your own perceptions of the relationship--such as your own sexual satisfaction, how much conflict you think there is in the relationship, and how committed you think your partner is--predicted 45% of the variance in your own relationship quality, at the beginning of the study. These same variables also predicted 18% of relationship satisfaction at the end of the study. And in fact, no other variables added to that total variance explained. Not your traits, not your partner’s traits, and not your partner’s perceptions of the relationship. All of the effects were driven by own judgments about the relationship.
Q: So, basically, if one is in a relationship and they are making the point to perceive themselves as in a happy relationship, they will be. How much does it matter to the success of the relationship if one perceives themselves positively but the other does not?
A: That’s a great question. My team and I were surprised that the partner’s perceptions of the relationship predicted so much less variance than own perceptions. Own perceptions of the relationship predicted 45% of the variance in relationship quality, but the partner’s perceptions (measured with the exact same variables!) predicted only 15%. That difference suggests that there’s a pretty big discrepancy in those ratings--how you perceive the relationship is not necessarily how your partner perceives it. It’s not clear at this point what the implications of those discrepancies are, or where they come from, but that would be a great topic for future research. How can two people be in the same relationship, and disagree so much about what it’s like?
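For readers unfamiliar with the phrase, "predicted 45% of the variance" is a statement about R². The sketch below shows only the generic formula. The function name and the toy numbers are illustrative assumptions, and the thread does not say exactly how the paper computed its figures.

```python
def variance_explained(actual, predicted):
    """Return R^2 = 1 - SS_residual / SS_total for two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    mean_actual = sum(actual) / len(actual)
    # Total variation of the outcome around its own mean.
    ss_total = sum((y - mean_actual) ** 2 for y in actual)
    # Variation left over after using the model's predictions.
    ss_residual = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1.0 - ss_residual / ss_total


# Hypothetical satisfaction scores and model predictions (made-up numbers).
satisfaction = [2.0, 4.0, 6.0, 8.0]
predictions = [3.0, 4.0, 6.0, 7.0]
r_squared = variance_explained(satisfaction, predictions)
```

A value of 0.45 would mean the predictions remove 45% of the squared deviation from the mean; predicting the mean for everyone gives 0.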
热血热胜红日光
Replying to 热血热胜红日光's post (#8)
This one is for those who aren't married yet and are still looking for a partner:
Q: I'm an extrovert and I've been intensely unhappy dating introverts. So this seems to go against my own experiences, because there's not enough in common between us to keep a relationship going, and I don't feel that they care about me enough to compromise (e.g. they agree to attend game night with me once a month vs weekly).
A: I think this really highlights that self-selection problem I mentioned: your relationships with introverts may not last long enough to be included in a study like this, which means those data are not part of the results. That’s why I really want to see more data on fledgling relationships. I’d love to enroll you in a study at the point when you have just started dating an introvert, and ask you about your experiences over those few ephemeral weeks or months that the relationship lasts before it fizzles out. Those sorts of data are so difficult to collect but I think they’re a really important piece of the puzzle.
Q: Well I've been with my introverted husband for nine years. We've decided just recently that separation is probably the best course of action in the future (neither of us want to make such a large decision right now, in the midst of the world being on pause and both of us being depressed about it).
A: I'm really sorry to hear that, Transplanted_Cactus.
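tidewater
Is this a neural-network, supervised-learning, multi-class classifier whose input is feature vectors?
Feature vectors don't always show a clean pattern, because of noise. Sometimes the loss curve just won't come down. Even ten thousand data points may not be enough.
Sometimes you also need manual feature crosses.
But if the goal is to reveal correlations rather than to actually build a classifier, the requirements are not as strict.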
热血热胜红日光
This one is for the programmers and statisticians:
Q: Thanks for doing this Dr. Joel! Very interesting research. What made you think machine learning would be a good way to study the success of romantic relationships?
A: Well, traditional statistical methods that we use in this field—like regression and multilevel modelling--are really great for delving into the mechanisms or inner workings of a handful of variables. But, they aren’t very good at dealing with a large number of variables at once. The major advantage of machine learning is that it can handle a very large number of predictors, and tell you which ones are really driving prediction, as well as how well they are performing as a group. So, the goal of the project was to take all of the many many variables that have already been examined in separate studies, and make them directly compete for that variance. Which of these hundreds of measures are most important, and when taken as a whole, how well do they perform?
热血热胜红日光
This one is for the middle-aged:
Q: Do these factors change in order of importance with age? Is there any set of factors that predicts divorce?
A: In fact, age was one of the only demographic variables that performed well in our models. Age contributed to 68% of the models we tested. Now, machine learning is pretty black boxy, so we can’t tell you exactly what age is doing in these models. But it’s quite possible that it’s a moderator of a lot of the other variables—that different variables are important for relationship quality depending on your age. We did not try to predict divorce or breakups in these models. Other papers have done that though, although not with machine learning. Karney & Bradbury 1995 (https://psycnet.apa.org/buy/1995-36558-001 ) is, I believe, still the most comprehensive paper to date on the predictors of divorce. Le et al 2010 (https://onlinelibrary.wiley.com/doi/full/10.1111/j.1475-6811.2010.01285.x?casa_token=pSw5wWgnZSYAAAAA%3ANGeIEDkDNcUmWWi4XiZN1gXDX4F8zMGP98V_O7sWkaW-Z8N0XZ0IuoJNoaSWAwHlZstwN_18X99JT8WQ) is the best paper on predictors of breakups. Top predictors of divorce and breakups tend to be global evaluations of the relationship. Variables like how satisfied you are in the relationship, and how committed you are to the relationship. That’s part of why we focused on these outcome variables in our project.
热血热胜红日光
This one is for those of you daydreaming about an open marriage:
Q: Did you study partners with open relationships? Do you believe that open relationships can be long lasting and fulfilling? Thank you for all the hard work. It's incredibly intriguing. I'll have a lot to read up on tonight.
A: This project didn’t really touch on open relationships, but I have done other work in this area. A couple of years ago, one of my students recruited 233 people who were interested in opening up their relationships—but hadn’t done so yet—and tracked them over two months. https://journals.sagepub.com/doi/abs/10.1177/1948550619897157 We found no differences in relationship quality between those who opened up over the course of the study and those who didn’t. We did find increases in sexual satisfaction for those who opened up. This is consistent with other, cross-sectional work on open relationships. So, we don’t have definitive answers yet, but so far, the data are looking promising for open relationships!
Q: Dr Joel, I am curious about how this data would hold up in a Polyamorous relationship. Would it be possible to use this research to see how having multiple partners could increase/decrease relationship satisfaction? I’m sure that research is much less pertinent, but I am definitely intrigued by the notion that it could be possible to have higher relationship satisfaction the more partners you have or even the opposite.
A: Polyamory and other kinds of non-monogamous relationships are a burgeoning research area right now, and we don't have a ton of data on them yet. Most of the data in the current project did not ask couples whether they were monogamous or not. However, other papers have been published that touch on this question, and the limited data we have so far suggests no differences in relationship quality between monogamous relationships and consensually non-monogamous ones. My lab has published one paper on the topic, which you can find here: https://journals.sagepub.com/doi/full/10.1177/1948550619897157?casa_token=OTSqSbG0yqUAAAAA%3AAU6UKKPGUrcM-OWD1b2h9-RnxqygpD6-z70aBKLSShnHHqT1axAtdEIWRbC2U9QEyaJKfpIs0_0ODw
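qiqi_hua
Why isn't tidewater saying "it takes a model to beat a model" this time?
Random forest is similar to a decision tree; as I vaguely recall, it uses log/sigmoid to compute the reduction in entropy and cross entropy. One very important requirement on the model's inputs is independence.
But look at the constructs listed in the figure: age and many of the others are clearly correlated. If the dataset is small, it is easy to overfit.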
热血热胜红日光
This one is for those who have been divorced:
Q: How do you control for the self-reported nature of the data? I would imagine people would be biased in their description of their current relationship compared to past relationships or the prospect of a future one. More plainly, I would expect Ex's to have a largely negative connotation and re-entering the dating pool requires substantial effort; so I may respond more positively about my current relationship.
A: Absolutely – people tend to hold a lot of positive illusions about their romantic partners, and to perceive their partners in a highly biased way. But, I think I would push back on the idea that this is something that needs to be controlled for or somehow subtracted from the ratings. When we’re talking about relationship quality, really, perception is reality. You’re happy if you think you’re happy! It’s an inherently subjective construct. I think that’s why own traits did such a better job of predicting relationship quality than the partner’s traits, in these analyses. Your own proneness to things like positive and negative affect are going to shape how you perceive your partner and the relationship, and therefore how satisfied you are with that relationship. To a large extent, we project our own personalities, feelings, biases, etc. onto our partners.
tidewater
This one is for the programmers and statisticians:
Q: Thanks for doing this Dr. Joel! Very interesting research. What made you think machine learning would be a good way to study the success of romantic relationships?
A: Well, traditional statistical methods that we use in this field—like regression and multilevel modelling--are really great for delving into the mechanisms or inner workings of a handful of variables. But, they aren’t very good at dealing with a large number of variables at once. The major advantage of machine learning is that it can handle a very large number of predictors, and tell you which ones are really driving prediction, as well as how well they are performing as a group. So, the goal of the project was to take all of the many many variables that have already been examined in separate studies, and make them directly compete for that variance. Which of these hundreds of measures are most important, and when taken as a whole, how well do they perform?
热血热胜红日光 posted on 2020-08-07 12:45

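Traditional ML does have two problems:
First: it falls over as soon as the data volume gets large.
Second, and actually the most serious: the linear assumption built into traditional ML models. For a non-linear model you have to write a kernel function, but a kernel function is itself a human-made assumption. A neural network, by contrast, is piecewise linear (ReLU neurons), with a sigmoid or softmax at the end for smoothing. And all the internal parameters (not the hyperparameters) are learned from data, which greatly reduces the extra bias that human intervention adds to the model.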
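tidewater
Why isn't tidewater saying "it takes a model to beat a model" this time?
Random forest is similar to a decision tree; as I vaguely recall, it uses log/sigmoid to compute the reduction in entropy and cross entropy. One very important requirement on the model's inputs is independence.
But look at the constructs listed in the figure: age and many of the others are clearly correlated. If the dataset is small, it is easy to overfit.
qiqi_hua posted on 2020-08-07 12:50

As I said earlier: the inherent linear assumption, or a hand-written kernel function.
我喜欢帅哥
Hahahaha. The two uncles have finally shown up together.
qiqi_hua
Replying to 热血热胜红日光's post (#8)
This one is for those who aren't married yet and are still looking for a partner: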
Q: I'm an extrovert and I've been intensely unhappy dating introverts. So this seems to go against my own experiences, because there's not enough in common between us to keep a relationship going, and I don't feel that they care about me enough to compromise (e.g. they agree to attend game night with me once a month vs weekly).
A: I think this really highlights that self-selection problem I mentioned: your relationships with introverts may not last long enough to be included in a study like this, which means those data are not part of the results. That’s why I really want to see more data on fledgling relationships. I’d love to enroll you in a study at the point when you have just started dating an introvert, and ask you about your experiences over those few ephemeral weeks or months that the relationship lasts before it fizzles out. Those sorts of data are so difficult to collect but I think they’re a really important piece of the puzzle.
Q: Well I've been with my introverted husband for nine years. We've decided just recently that separation is probably the best course of action in the future (neither of us want to make such a large decision right now, in the midst of the world being on pause and both of us being depressed about it).
A: I'm really sorry to hear that, Transplanted_Cactus.

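热血热胜红日光 posted on 2020-08-07 12:41

No, no, no. I think the most important advice for unmarried young women is actually this sentence: the person you choose may not be as important as the relationship you build.
Marriage is just a door; the cultivation is up to each person.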
热血热胜红日光
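Is this a neural-network, supervised-learning, multi-class classifier whose input is feature vectors?
Feature vectors don't always show a clean pattern, because of noise. Sometimes the loss curve just won't come down. Even ten thousand data points may not be enough.
Sometimes you also need manual feature crosses.
But if the goal is to reveal correlations rather than to actually build a classifier, the requirements are not as strict.
tidewater posted on 2020-08-07 12:43

I haven't seen the full text of the paper... I'm not in academia and have no subscription. Maybe someone here is a professor of psychology, sociology, or at a business school... and can read the full text and write up a summary.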
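tidewater
Why isn't tidewater saying "it takes a model to beat a model" this time?
Random forest is similar to a decision tree; as I vaguely recall, it uses log/sigmoid to compute the reduction in entropy and cross entropy. One very important requirement on the model's inputs is independence.
But look at the constructs listed in the figure: age and many of the others are clearly correlated. If the dataset is small, it is easy to overfit.
qiqi_hua posted on 2020-08-07 12:50

A small dataset and overfitting could indeed be a problem.
For overfitting, the network's number of parameters must not be too large relative to the size of the dataset. Consider adding L2 regularization.
For chaos effects, also rotate the random seed and check whether the results stay stable.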
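The seed-rotation check can be sketched in a few lines. This is a minimal illustration under stated assumptions: a bootstrap mean stands in for "train the model and record its score", and the function names and ratings are hypothetical, not taken from the paper.

```python
import random
import statistics


def bootstrap_mean(data, seed):
    """Mean of one bootstrap resample of `data`, drawn with the given seed."""
    rng = random.Random(seed)  # private generator, so runs are reproducible
    resample = [rng.choice(data) for _ in data]
    return sum(resample) / len(resample)


def seed_spread(data, seeds):
    """Repeat the same estimate under several seeds; return its mean and spread."""
    estimates = [bootstrap_mean(data, seed) for seed in seeds]
    return statistics.mean(estimates), statistics.pstdev(estimates)


# Hypothetical ratings on a 1-7 scale (made-up numbers).
ratings = [5, 6, 4, 7, 5, 6, 3, 5, 6, 4]
average, spread = seed_spread(ratings, seeds=range(10))
```

If the spread across seeds is small relative to the effect being claimed, the result probably does not hinge on one lucky seed.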
热血热胜红日光
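Why isn't tidewater saying "it takes a model to beat a model" this time?
Random forest is similar to a decision tree; as I vaguely recall, it uses log/sigmoid to compute the reduction in entropy and cross entropy. One very important requirement on the model's inputs is independence.
But look at the constructs listed in the figure: age and many of the others are clearly correlated. If the dataset is small, it is easy to overfit.
qiqi_hua posted on 2020-08-07 12:50

The dataset seems to cover 11,196 couples. More data would of course be better... but data is hard to come by in social science research; recruiting this many people for a study takes a lot of time and a lot of money.
qiqi_hua
Hahahaha. The two uncles have finally shown up together.
我喜欢帅哥 posted on 2020-08-07 12:56

These two often show up together, discussing "men and women" through science (data models) + philosophy + biology + a dash of religious studies + behavioral/cognitive inference. Sometimes it feels like overkill for the topic; but then again, "men and women" is what drives the world.
Keep going, you two!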
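tidewater
The dataset seems to cover 11,196 couples. More data would of course be better... but data is hard to come by in social science research; recruiting this many people for a study takes a lot of time and a lot of money.
热血热胜红日光 posted on 2020-08-07 13:00

Traditional methods work with far too little data.
To get a lot of data, everyone would need a wearable gadget that uploads its data. A hundred mega (一百兆) data points plus time series and geometric information, then big query and distributed training in the cloud...
But the cloud isn't cheap either. The bigger headache is privacy and ethics.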
热血热胜红日光
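No, no, no. I think the most important advice for unmarried young women is actually this sentence: the person you choose may not be as important as the relationship you build.
Marriage is just a door; the cultivation is up to each person.
qiqi_hua posted on 2020-08-07 12:57

Your advice is also one of the authors' most important findings... but it comes with a condition:
"Marriage is just a door; the cultivation is up to each person" is quite right, but whom do you walk through that door with? All the data in this study come from people who are already in a relationship (presumably both married and unmarried, possibly same-sex couples too).
I think screening carefully before marriage matters a lot: check whether your personalities are compatible. The example here is an extroverted woman who chose an introverted man and eventually had to split up (though mainly, I suspect, because the people involved didn't put in the work).
qiqi_hua
The dataset seems to cover 11,196 couples. More data would of course be better... but data is hard to come by in social science research; recruiting this many people for a study takes a lot of time and a lot of money.
热血热胜红日光 posted on 2020-08-07 13:00

Actually, surveys like this have an obvious bias: people willing to join are either curious or already aware that something is wrong. The input dataset is already biased.
I have an idea that tidewater might consider: instead of using survey results as inputs, collect material directly from social media. Use NLP to classify word attributes, whether posts are negative or positive, and high-frequency types, language, and images. Then add age, kids, job, and so on. Use a temporal model, because marriage quality is not a one-time regression result; it plays out over the long term.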
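tidewater
Why isn't tidewater saying "it takes a model to beat a model" this time?
Random forest is similar to a decision tree; as I vaguely recall, it uses log/sigmoid to compute the reduction in entropy and cross entropy. One very important requirement on the model's inputs is independence.
But look at the constructs listed in the figure: age and many of the others are clearly correlated. If the dataset is small, it is easy to overfit.
qiqi_hua posted on 2020-08-07 12:50

Also, a neural-network classifier uses cross-entropy loss too, which is really just log loss... Only a regressor uses mean squared loss... otherwise it diverges mathematically.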
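The point that cross-entropy loss "is really just log loss" can be checked numerically. The sketch below is illustrative only: the function names and probabilities are assumptions. With a one-hot label, the cross entropy reduces to the negative log of the probability given to the true class.

```python
import math


def cross_entropy(true_dist, predicted_probs):
    """Cross entropy H(p, q) = -sum_i p_i * log(q_i), skipping terms where p_i is 0."""
    return -sum(p * math.log(q) for p, q in zip(true_dist, predicted_probs) if p > 0)


def log_loss(true_class, predicted_probs):
    """Log loss for one example: negative log of the probability of the true class."""
    return -math.log(predicted_probs[true_class])


def squared_error(target, prediction):
    """Squared error for one regression example."""
    return (target - prediction) ** 2


probs = [0.2, 0.7, 0.1]    # hypothetical softmax output for three classes
one_hot = [0.0, 1.0, 0.0]  # the true class is index 1
classification_loss = cross_entropy(one_hot, probs)
regression_loss = squared_error(3.0, 2.5)
```

Both `cross_entropy(one_hot, probs)` and `log_loss(1, probs)` evaluate to `-log(0.7)`, which is why the two names are used interchangeably for classifiers.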
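tidewater
Actually, surveys like this have an obvious bias: people willing to join are either curious or already aware that something is wrong. The input dataset is already biased.
I have an idea that tidewater might consider: instead of using survey results as inputs, collect material directly from social media. Use NLP to classify word attributes, whether posts are negative or positive, and high-frequency types, language, and images. Then add age, kids, job, and so on. Use a temporal model, because marriage quality is not a one-time regression result; it plays out over the long term.
qiqi_hua posted on 2020-08-07 13:10

Yes, yes, yes. Studying Quora's big data has always been my dream.
But humanity's current Natural Language Processing is still too primitive... mainly in understanding the semantics of sentences and paragraphs.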
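Orangetabby
Is this a neural-network, supervised-learning, multi-class classifier whose input is feature vectors?
Feature vectors don't always show a clean pattern, because of noise. Sometimes the loss curve just won't come down. Even ten thousand data points may not be enough.
Sometimes you also need manual feature crosses.
But if the goal is to reveal correlations rather than to actually build a classifier, the requirements are not as strict.
tidewater posted on 2020-08-07 12:43

If you just want the correlations among features, wouldn't PCA do?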
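qiqi_hua
Traditional methods work with far too little data.
To get a lot of data, everyone would need a wearable gadget that uploads its data. A hundred mega (一百兆) data points plus time series and geometric information, then big query and distributed training in the cloud...
But the cloud isn't cheap either. The bigger headache is privacy and ethics.
tidewater posted on 2020-08-07 13:06

Wearables + IoT + data processing at the edge + AI in the cloud. You're entering socialism ahead of schedule.
But how do you know Alexa isn't already doing this? There's no need for body-worn devices: install a radar-like sensor on an indoor device, record the bounced-back signal with ToF, and FFT processing can be used to monitor breathing.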
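To make the "FFT to monitor breathing" idea concrete, here is a toy sketch that recovers a breathing rate from a periodic signal. It is not a description of any real device: the signal is synthetic, the sampling rate is made up, and a naive discrete Fourier transform is used in place of an FFT (the FFT computes the same transform, only faster).

```python
import cmath
import math


def dominant_frequency(samples, sample_rate_hz):
    """Return the frequency (Hz) of the strongest non-DC component, via a naive DFT."""
    n = len(samples)
    best_bin, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):  # skip bin 0, which holds the constant offset
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate_hz / n


# Synthetic "chest movement" signal: 0.25 Hz (15 breaths per minute),
# sampled at 4 Hz for 64 seconds, with a constant offset. All values are made up.
SAMPLE_RATE_HZ = 4.0
signal = [1.0 + math.sin(2 * math.pi * 0.25 * t / SAMPLE_RATE_HZ) for t in range(256)]
breaths_per_minute = dominant_frequency(signal, SAMPLE_RATE_HZ) * 60
```

A real sensor signal would be noisy and non-stationary, so windowing and filtering would be needed before a peak like this could be trusted.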
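qiqi_hua
Your advice is also one of the authors' most important findings... but it comes with a condition:
"Marriage is just a door; the cultivation is up to each person" is quite right, but whom do you walk through that door with? All the data in this study come from people who are already in a relationship (presumably both married and unmarried, possibly same-sex couples too).
I think screening carefully before marriage matters a lot: check whether your personalities are compatible. The example here is an extroverted woman who chose an introverted man and eventually had to split up (though mainly, I suspect, because the people involved didn't put in the work).
热血热胜红日光 posted on 2020-08-07 13:08

Personality is not fixed. I think this model has some flaws.
I'll come back tonight to see where your discussion ends up; I'm off to reminisce about my career and add some material to the leadership principles.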
热血热胜红日光
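Wearables + IoT + data processing at the edge + AI in the cloud. You're entering socialism ahead of schedule.
But how do you know Alexa isn't already doing this? There's no need for body-worn devices: install a radar-like sensor on an indoor device, record the bounced-back signal with ToF, and FFT processing can be used to monitor breathing.
qiqi_hua posted on 2020-08-07 13:17

I get the feeling Apple has been quietly doing this for years...
Isn't it said that when people wear an Apple Watch while getting busy, the data all go to the back-end servers for analysis? So Steve Jobs must have known the most about all those detailed marital-intimacy data that tidewater keeps bringing up.
After seeing it he was filled with pity for mankind, thought it over, and grew utterly disappointed in human men and women... so he decided to hand the reins to Tim Cook, who has no interest in affairs between men and women.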
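qiqi_hua
I get the feeling Apple has been quietly doing this for years...
Isn't it said that when people wear an Apple Watch while getting busy, the data all go to the back-end servers for analysis? So Steve Jobs must have known the most about all those detailed marital-intimacy data that tidewater keeps bringing up.
After seeing it he was filled with pity for mankind, thought it over, and grew utterly disappointed in human men and women... so he decided to hand the reins to Tim Cook, who has no interest in affairs between men and women.
热血热胜红日光 posted on 2020-08-07 13:25

I actually wanted to bring up that example myself, but mindful of my standing, I couldn't tease my juniors so casually. Hats off to you.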
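tidewater
If you just want the correlations among features, wouldn't PCA do?
Orangetabby posted on 2020-08-07 13:16

Isn't PCA based on the assumption that the data distribution is ellipsoidal?
In fact, a single multiplicative relationship is enough to floor the vast majority of statistical models...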
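chinadrachen
No, no, no. I think the most important advice for unmarried young women is actually this sentence: the person you choose may not be as important as the relationship you build.
Marriage is just a door; the cultivation is up to each person.
qiqi_hua posted on 2020-08-07 12:57

My understanding is that the person you choose still matters a lot, but even with the right person, things end badly if you don't work at it. And if you chose the wrong person to begin with, then however hard you work at it, you'll probably get half the result for twice the effort...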
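chinadrachen
I get the feeling Apple has been quietly doing this for years...
Isn't it said that when people wear an Apple Watch while getting busy, the data all go to the back-end servers for analysis? So Steve Jobs must have known the most about all those detailed marital-intimacy data that tidewater keeps bringing up.
After seeing it he was filled with pity for mankind, thought it over, and grew utterly disappointed in human men and women... so he decided to hand the reins to Tim Cook, who has no interest in affairs between men and women.
热血热胜红日光 posted on 2020-08-07 13:25

That's terrifying... Is there any privacy left? Especially in the Alexa scenario.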
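chickenrib
A marriage needs both partners to work at it. What's surprising about that? How can you reap without putting anything in? Marriage should be one of the biggest investments of a lifetime.
novavista
Why isn't tidewater saying "it takes a model to beat a model" this time?
Random forest is similar to a decision tree; as I vaguely recall, it uses log/sigmoid to compute the reduction in entropy and cross entropy. One very important requirement on the model's inputs is independence.
But look at the constructs listed in the figure: age and many of the others are clearly correlated. If the dataset is small, it is easy to overfit.
qiqi_hua posted on 2020-08-07 12:50

What you're describing is logistic regression. Decision trees and random forests make predictions by splitting the data into subsets.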
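"Splitting the data into subsets" can be illustrated with a single regression-tree split. This is a minimal sketch with made-up data and a hypothetical function name, not the paper's code: it picks the threshold on one feature that leaves each side as homogeneous as possible. A random forest averages many trees built from such splits, each grown on a bootstrap sample with a random subset of features.

```python
def best_split(xs, ys):
    """Find the threshold on one feature that minimizes the summed squared error
    of predicting each side by its own mean (one regression-tree split)."""
    def sse(values):
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best_threshold, best_error = None, float("inf")
    points = sorted(set(xs))
    # Candidate thresholds sit halfway between neighbouring distinct values.
    for low, high in zip(points, points[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        error = sse(left) + sse(right)
        if error < best_error:
            best_threshold, best_error = threshold, error
    return best_threshold, best_error


# Hypothetical data: satisfaction jumps once "appreciation" exceeds about 3.
appreciation = [1, 2, 3, 4, 5, 6]
satisfaction = [2.0, 2.0, 2.0, 6.0, 6.0, 6.0]
threshold, error = best_split(appreciation, satisfaction)
```

No log or sigmoid is involved in this regression case; the split is chosen purely by how much it reduces within-group variation.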
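chatchat
My understanding is that the person you choose still matters a lot, but even with the right person, things end badly if you don't work at it. And if you chose the wrong person to begin with, then however hard you work at it, you'll probably get half the result for twice the effort...
chinadrachen posted on 2020-08-07 13:34

I think so too. It's like selecting athletes: first you have to pick the right talent, then add the right training, and only then can they succeed.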
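Heiniu
I get the feeling Apple has been quietly doing this for years...
Isn't it said that when people wear an Apple Watch while getting busy, the data all go to the back-end servers for analysis? So Steve Jobs must have known the most about all those detailed marital-intimacy data that tidewater keeps bringing up.
After seeing it he was filled with pity for mankind, thought it over, and grew utterly disappointed in human men and women... so he decided to hand the reins to Tim Cook, who has no interest in affairs between men and women.
热血热胜红日光 posted on 2020-08-07 13:25

Dying of laughter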

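tidewater
What you're describing is logistic regression. Decision trees and random forests make predictions by splitting the data into subsets.
novavista posted on 2020-08-07 13:40

In SDE, the drawback of hard-partition-based algorithms is often that they must answer Shakespeare's classic "to be, or not to be" question at an early stage... Apart from simple problems such as sorting, they are hard to scale up.
The hierarchical approaches in non-linear programming and the various variational methods do not avoid partitioning altogether; they just try to do an implied soft partition, so the algorithm does not have to give a hard answer to Shakespeare's question in its early stages.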
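tidewater
I get the feeling Apple has been quietly doing this for years...
Isn't it said that when people wear an Apple Watch while getting busy, the data all go to the back-end servers for analysis? So Steve Jobs must have known the most about all those detailed marital-intimacy data that tidewater keeps bringing up.
After seeing it he was filled with pity for mankind, thought it over, and grew utterly disappointed in human men and women... so he decided to hand the reins to Tim Cook, who has no interest in affairs between men and women.
热血热胜红日光 posted on 2020-08-07 13:25

One word: nice 👍!!!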
热血热胜红日光
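My understanding is that the person you choose still matters a lot, but even with the right person, things end badly if you don't work at it. And if you chose the wrong person to begin with, then however hard you work at it, you'll probably get half the result for twice the effort...
chinadrachen posted on 2020-08-07 13:34

I think so too. qiqi_hua also pointed out that this dataset has self-selection bias.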
热血热胜红日光
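I actually wanted to bring up that example myself, but mindful of my standing, I couldn't tease my juniors so casually. Hats off to you.
qiqi_hua posted on 2020-08-07 13:27

Oh dear, now that you mention it, it has only just dawned on me... My apologies, that was presumptuous.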
我喜欢帅哥
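These two often show up together, discussing "men and women" through science (data models) + philosophy + biology + a dash of religious studies + behavioral/cognitive inference. Sometimes it feels like overkill for the topic; but then again, "men and women" is what drives the world.
Keep going, you two!
qiqi_hua posted on 2020-08-07 13:03

Both are experts, and both are very interested in relations between men and women.
artdong
The person you choose may not be as important as the relationship you build. As a culture, we put so much emphasis on choosing the right person. These results suggest that it’s really more important to be the right person. To create the conditions that will allow a relationship to flourish.
That summary is really incisive. Thumbs up.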

热血热胜红日光
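Both are experts, and both are very interested in relations between men and women.
我喜欢帅哥 posted on 2020-08-07 14:18

Now you've made me blush... I'll be more careful next time.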
热血热胜红日光
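That's terrifying... Is there any privacy left? Especially in the Alexa scenario.
chinadrachen posted on 2020-08-07 13:37

I was just joking around. In reality, most companies' data collection and analysis are nowhere near that level. I know some pharmaceutical companies have started using wearables in R&D, but it cannot be fully automated. For example, when a patient (or trial volunteer) is given a wearable device, the data are not sent to the pharma company's research department in real time; they go first to the clinic running the clinical trial, where doctors and nurses enter the data.
There are many considerations behind this, especially the FDA's GxP requirements (GCP).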
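ab18
Why isn't tidewater saying "it takes a model to beat a model" this time?
Random forest is similar to a decision tree; as I vaguely recall, it uses log/sigmoid to compute the reduction in entropy and cross entropy. One very important requirement on the model's inputs is independence.
But look at the constructs listed in the figure: age and many of the others are clearly correlated. If the dataset is small, it is easy to overfit.
qiqi_hua posted on 2020-08-07 12:50

Since when does an RF model require independent inputs? (facepalm) RF's biggest advantage over simple linear regression is that it can model some complex local interactions.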
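The contrast between a main-effects linear model and a partition-based model can be seen on a tiny, deliberately extreme example. The data below are hypothetical and chosen so that the outcome depends only on the interaction of two inputs. The sketch shows only that a partition can represent such an interaction; it does not show that a greedy tree would find these splits, and it is not the paper's analysis.

```python
def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def group_means(x1, x2, y):
    """Mean of y inside each (x1, x2) cell, i.e. what a partition on both inputs predicts."""
    cells = {}
    for a, b, target in zip(x1, x2, y):
        cells.setdefault((a, b), []).append(target)
    return {cell: sum(vals) / len(vals) for cell, vals in cells.items()}


# Hypothetical pure-interaction data: y = x1 * x2, with both inputs in {-1, +1}.
x1 = [-1, -1, 1, 1]
x2 = [-1, 1, -1, 1]
y = [a * b for a, b in zip(x1, x2)]

# Each input alone has zero covariance with y (and the inputs are uncorrelated),
# so a linear model with only main effects gets slope 0 for both.
linear_signal = (covariance(x1, y), covariance(x2, y))

# Partitioning on x1 and x2 together recovers y exactly in every cell.
cell_predictions = group_means(x1, x2, y)
```

A linear regression could capture this too, but only if someone adds the product term `x1 * x2` by hand, which is the "manual feature cross" mentioned earlier in the thread.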
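ab18
In SDE, the drawback of hard-partition-based algorithms is often that they must answer Shakespeare's classic "to be, or not to be" question at an early stage... Apart from simple problems such as sorting, they are hard to scale up.
The hierarchical approaches in non-linear programming and the various variational methods do not avoid partitioning altogether; they just try to do an implied soft partition, so the algorithm does not have to give a hard answer to Shakespeare's question in its early stages.
tidewater posted on 2020-08-07 13:48

RF is an ensemble model, which to some extent avoids very hard partitions. It's true that it doesn't scale, but this is a scientific study, and the sample isn't large anyway.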
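Orangetabby
Isn't PCA based on the assumption that the data distribution is ellipsoidal?
In fact, a single multiplicative relationship is enough to floor the vast majority of statistical models...
tidewater posted on 2020-08-07 13:31

PCA projects the feature vectors onto another space; it is used to reduce dimensionality.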
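As a small illustration of "project onto another space to reduce dimension", the sketch below does PCA for two features only, using the closed-form angle of the leading eigenvector of a 2x2 covariance matrix. The function names and data are hypothetical; real use with many features would rely on a linear-algebra library.

```python
import math


def first_principal_axis(xs, ys):
    """Unit vector along the direction of greatest variance for 2-D data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    # Closed-form angle of the leading eigenvector of a 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    return math.cos(angle), math.sin(angle)


def project(xs, ys, axis):
    """Reduce each centred 2-D point to one coordinate along `axis`."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return [(x - mean_x) * axis[0] + (y - mean_y) * axis[1] for x, y in zip(xs, ys)]


# Hypothetical, perfectly correlated features: feature_b = 2 * feature_a.
feature_a = [-1.0, 0.0, 1.0]
feature_b = [-2.0, 0.0, 2.0]
axis = first_principal_axis(feature_a, feature_b)
scores = project(feature_a, feature_b, axis)
```

Here two correlated columns collapse into one score per point with no loss, because all the variance lies along a single line.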
热血热胜红日光
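I've only just realized this class is full of machine learning scientists... An honor to meet you all; forgive my earlier lack of respect.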
CleverBeaver
This
clearly can't support a causal claim, okay?
This is exactly how ML gets ruined.
CleverBeaver
The person you choose may not be as important as the relationship you build. As a culture, we put so much emphasis on choosing the right person. These results suggest that it’s really more important to be the right person. To create the conditions that will allow a relationship to flourish.
That summary is really incisive. Thumbs up.
artdong posted on 2020-08-07 14:23

It sounds logical, but I knew that without collecting any data.
热血热胜红日光
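This
clearly can't support a causal claim, okay?
This is exactly how ML gets ruined.
CleverBeaver posted on 2020-08-07 15:25

The true cause can't be measured. How would you collect the data? Would everyone really have to wear a wearable, as tidewater says? That really would be <1984> all over again.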
热血热胜红日光
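It sounds logical, but I knew that without collecting any data.
CleverBeaver posted on 2020-08-07 15:26

What you "know" can only be a conjecture (hypothesis), or perhaps just an opinion; without experimental data to back it up, it isn't science.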
热血热胜红日光
Replying to CleverBeaver's post (#54)
By the way... is your ID related to your school? Are you a Beaver from MIT or from CalTech?
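qiqi_hua
I was just joking around. In reality, most companies' data collection and analysis are nowhere near that level. I know some pharmaceutical companies have started using wearables in R&D, but it cannot be fully automated. For example, when a patient (or trial volunteer) is given a wearable device, the data are not sent to the pharma company's research department in real time; they go first to the clinic running the clinical trial, where doctors and nurses enter the data.
There are many considerations behind this, especially the FDA's GxP requirements (GCP).
热血热胜红日光 posted on 2020-08-07 14:42

Data collected by pharmaceutical companies contain a lot of private information. But with ordinary user-level data, such as the data on our phones, end users have no idea how much is being collected, processed, and exploited.
Here's a scenario: many people choose to use free Wi-Fi in public places. You have to click yes before connecting, and a lot is hidden in that click. If you wander around a mall, the Wi-Fi can locate where you are, your route, which stores you stop at, and whether you buy anything. Is that data useful? Very. First, it can be sold to merchants. Second, it enables targeted ads from banks, health-supplement sellers, and so on. Third, it can predict consumption patterns: what type of consumer you are, at what tier, and what you might buy next. Buy a certain toy today, and in a few months you will be shown maternity and baby products...
Another possibility is not that the smart home collects customer information, but that hackers use the smart home to intercept it. A few years ago Hikvision was banned from use by US government departments, which was also the prequel to what is now happening with TikTok.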

米菲兔
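Ducking for cover as I say this: survey-based work like this can only go to PNAS or a psychology journal (by the standards of the more quantitative fields in b-schools today)... Self-reported/self-stated preferences can differ a lot from revealed preferences (which require behavioral data collected directly by something like Alexa). I recently reviewed a paper on energy consumption in which the ETH folks built a system themselves and then ran a field experiment, with roughly a year of data. Of course, for marriage/relationships a year may not be enough, but that depends on the specific research question. The ML technique isn't the selling point of this research; they just ran a standard RF... PNAS these days wins mainly on storyline. On technique, it is probably decades behind NIPS/ICML.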
热血热胜红日光
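Replying to 米菲兔's post (#59)
All psychology-related research has this problem. Quantitative analysis is hard because there is basically no way to directly observe the exact state of what is being studied. Psychology still has to rely on storytelling.
Could you say more about the energy consumption research? Is it about the effect of climate change on people's mindset? Or about observing consumer behavior under different pricing mechanisms?
Quite a few auto insurers have started using a new pricing mechanism: a small device is plugged into the car's OBD II port (the on-board computer bus interface) to collect and analyze the driver's behavior data, which then forms the basis for pricing. If you could get that data for research and overlay weather, road conditions, time of day, and so on, you could definitely build a beautiful model.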
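shanshuipinglan
This is a thread where people have gone off the deep end with modeling. Can anyone still get busy in peace... Damn, thinking about data collection, then DOE and model building, right in the middle of it... This is how humanity goes extinct...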
热血热胜红日光
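This is a thread where people have gone off the deep end with modeling. Can anyone still get busy in peace... Damn, thinking about data collection, then DOE and model building, right in the middle of it... This is how humanity goes extinct...
shanshuipinglan posted on 2020-08-07 22:17

Are you a modeling enthusiast like tidewater too... The original point of this thread was to let all the programmers relax.
Just think: the code you write may one day be used for more important things... exploring humanity's inner world and the human heart. That, too, is a sea of stars in another dimension.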