Wednesday, February 3, 2016

Nguyễn Anh Tuấn

Tôi vừa được biết anh Nguyễn Anh Tuấn trở về Việt Nam, sau vài năm ra nước ngoài làm việc và học tập về tự do và dân chủ. Tôi muốn ghi lại đây, bởi sau này có thể chúng ta sẽ thấy đây là một sự kiện mang tính lịch sử.

Tôi không quen anh Tuấn, chỉ biết anh ấy thông qua một người bạn chung. Đọc những gì anh Tuấn viết và những việc anh ấy làm, tôi ngưỡng mộ sự dũng cảm, kiến thức và kinh nghiệm đấu tranh của anh ấy cho tự do, dân chủ và quyền con người. Trẻ tuổi, tài năng, dũng cảm, trung thực... ở Nguyễn Anh Tuấn hội đủ tố chất của một thủ lĩnh có khả năng tạo ra sự thay đổi đột biến cho xã hội Việt Nam.

Nguyễn Anh Tuấn sinh năm 1990, tốt nghiệp thủ khoa Học Viện Hành Chính Công, được mời vào đảng Cộng Sản, nhưng từ chối, quyết chọn con đường đấu tranh cho dân chủ và tự do ở Việt Nam. Trong ba năm vừa qua, anh Tuấn đã đi đến hơn hai mươi quốc gia, học hỏi và làm việc với các tổ chức quốc tế để thúc đẩy quyền con người, thúc đẩy sự phát triển của xã hội dân sự và đấu tranh vận động cho tiến trình dân chủ hóa Việt Nam.

Rất nhiều người Việt Nam ngưỡng mộ Joshua Wong, thủ lĩnh phong trào biểu tình Ô Dù ở Hồng Kông phản đối chính quyền Trung Quốc. Báo chí Việt Nam cũng liên tục đưa tin về vị thủ lĩnh thiếu niên của Hồng Kông. Tôi nghĩ Nguyễn Anh Tuấn chính là một Joshua Wong của Việt Nam. Thậm chí con đường đấu tranh Nguyễn Anh Tuấn đã, đang và sẽ chông gai hơn rất nhiều, vì, không giống như Joshua Wong, ở Việt Nam ít người hiểu và ủng hộ anh ấy. Xã hội Việt Nam chưa có văn hóa ủng hộ tự do và bảo vệ những người đấu tranh cho tự do.

Đa số người dân, một lần nữa đội ơn Bộ Giáo Dục, coi những người như anh Tuấn là thành phần phản động, cần phải tránh xa và phản đối. Họ không hiểu rằng xã hội muốn phát triển luôn cần phải có những người như Nguyễn Anh Tuấn. Họ không hiểu rằng anh Tuấn và những người bạn của anh ấy như luật sư Trịnh Hữu Long, nhà báo Đoan Trang, v.v. đấu tranh không chỉ cho tự do và quyền con người của bản thân, mà cho tất cả mọi người, kể cả những ai xem họ là phản động. Thời buổi này, ít ai còn tin có những người như anh Tuấn, dám hi sinh lợi ích và tự do của bản thân vì lợi ích và tự do của người khác.

Tôi viết bài này không phải để ca tụng Nguyễn Anh Tuấn. Tôi nghĩ anh Tuấn không cần sự ca tụng của tôi. Tôi tin Nguyễn Anh Tuấn sẽ có những đóng góp chất lượng góp phần phát triển xã hội Việt Nam, nhưng tôi sẽ không thất vọng nếu anh ấy không làm được gì. Nếu anh Tuấn, một ngày nào đó mỏi gối chồn chân, thôi không đấu tranh nữa, tôi cũng sẽ không ngạc nhiên hay thất vọng. Muốn sống thế nào là quyền tự do, bất khả xâm phạm, của mỗi người.

Tôi nói về anh Tuấn ở đây vì những người như anh Tuấn cần được biết đến nhiều hơn, cần được ủng hộ nhiều hơn. Blog này chẳng mấy người đọc, nhưng ít còn hơn không. Nhóm của anh Tuấn, dẫu có tài năng và nhiệt tình cỡ nào, nếu không có sự ủng hộ của nhiều người, khó lòng tạo ra bất kỳ sự thay đổi nào. Nhưng cũng đừng ủng hộ một cách mù quáng, mà hãy đọc những gì họ viết và quan sát những gì họ làm, rồi suy nghĩ xem họ có đáng để được ủng hộ hay không. Tôi đã làm như vậy và câu trả lời của tôi là có.

Tự do không phải tự dưng mà có. Nguyễn Anh Tuấn và nhóm của anh ấy không thể đem lại tự do cho tất cả mọi người. Muốn có tự do, mỗi người phải tự thân vận động. Đừng nghĩ rằng có người đấu tranh cho mình rồi, mình không cần phải đấu tranh nữa. Tự do của mình, mình phải can dự. Tự do của mình, mình phải đấu tranh. Không có cách nào khác đâu. Nếu cứ ngồi chờ lãnh tụ, có thể lãnh tụ đâu không thấy, toàn thấy độc tài.

Exploiting the Diffie-Hellman bug in socat

A few days ago socat, a popular networking tool, issued a curious sounding security advisory:

"In the OpenSSL address implementation the hard coded 1024 bit DH p parameter was not prime. The effective cryptographic strength of a key exchange using these parameters was weaker than the one one could get by using a prime p. Moreover, since there is no indication of how these parameters were chosen, the existence of a trapdoor that makes possible for an eavesdropper to recover the shared secret from a key exchange that uses them cannot be ruled out."

More background information on this vulnerability can be found on Ars Technica and Hacker News, in this post I want to focus on building an exploit.

The patch shows that

p = 143319364394905942617148968085785991039146683740268996579566827015580969124702493833109074343879894586653465192222251909074832038151585448034731101690454685781999248641772509287801359980318348021809541131200479989220793925941518568143721972993251823166164933334796625008174851430377966394594186901123322297453

It has been discovered that

p = 271 * 13597 * 38894884397634366007356454548332370646972724268802781973440208895542936165564656473524541403310393405820598366261673173802130771236325314878371830363723788045821711985461441675679316058246609104355161134470046705337593170498462616195650378975298117141144096886684800236261920005248055422089305813639519

The last factor, let's denote it F3889, is a 1002-bit non-prime integer, whose factors remain unknown. By trial division we know that its smallest factors are larger than $2^{40}$. If we want to go further, we'll need to deploy a proper factorization algorithm. In that case I'll choose Lenstra elliptic curve factorization, whose running time depends on the size of smallest factor of F3889 rather than the size of the number itself. If the smallest factor is smaller than $2^{250}$, the Lenstra's algorithm would recover it before too long. The other factors can then be recovered by either Lenstra's or the general number field sieve algorithm.

On the other hand if F3889 is a product of two 500-bit primes, chances are we might never be able to factor it (without spending a million of dollars or so). This is very unlikely, if $p$ was indeed randomly generated. Thus, it's reasonable to assume that anyone determined and lucky enough can factor F3889 completely. It'll be a fun project, let me know if you want to work on it :).

But, you might ask, why do we care so much about the factors of $p$? It seems to have nothing to do with solving the discrete log problem (DLP) on $Z_p$, doesn't it? The answer is yes, knowing the factors of $p$, and if they are small enough, allows one to solve DLP on $Z_p$, thanks to the Chinese Remainder Theorem (CRT).

As I wrote before on this blog, CRT is one of the most powerful cryptanalysis tools ever. I've seen countless systems broken because of it. Whenever analyzing or designing a new system, ask yourself if you can break it using this simple trick, and you'll be surprised that most of the times the answer is yes. Pohlig and Hellman probably asked this question themselves, and eventually figured out that if the order of a group is a product of small primes (i.e., a smooth number), one can solve DLP in the subgroups, which is easier, and combine the results using CRT to obtain the discrete log in the original group.

Let's look at an example. Suppose that we want to solve for $x$, given $g$ and $h = g^x \pmod{n}$, where the order of group $Z_n$ is $\phi(n) = q_1 * q_2$ with $q_1$ and $q_2$ are prime. We can break this problem into three smaller sub-problems:
1/ Find $x_1$ such that $h^{q_1} = (g^{q_1})^{x_1} \pmod{n}$
2/ Find $x_2$ such that $h^{q_2} = (g^{q_2})^{x_2} \pmod{n}$
3/ We can prove that $x = x_1 \pmod{q_2}$ and $x = x_2 \pmod{q_1}$, and thus can figure out $x$ with CRT.

Note that 1/ and 2/ are themselves DLP, but they are in subgroups of order $q_1$ and $q_2$, respectively. Hence, the Pohlig-Hellman algorithm reduces DLP in group of order $q_1 * q_2$ to DLP in group of order $q_1$ or $q_2$. If $q_1$ or $q_2$ or small, we can brute-force for $x_1$ or $x_2$. If they are larger, we can deploy the Pollard's rho or the index calculus algorithm. It has been estimated that an academic team can break discrete log if $q_1$ or $q_2$ is a 768-bit prime and that a nation-state can break a 1024-bit prime.

I hope that it's clearer now why we need to factor $p$. We want to calculate $\phi(p)$ and its factors, which we need to deploy the Pohlig-Hellman attack. Is it surprising that the factors of the order of the group determines how hard DLP is on that group?

If the largest factor of $p$ is smaller than 2^800 it's reasonable to assume that anyone knowing this factor can solve DLP on the multiplicative group $Z_p$. That is, they can find $x$ given $g$ and $h$, where $g^x = h \pmod{p}$; this in turn allows them recovering the shared secret just by passively eavesdropping on the Diffie-Hellman key exchange. Note that if the larger factors of $p$ are not safe prime, i.e., not in the form of $2 * q + 1$ where $q$ is also a prime, the computation cost is even smaller.

In summary you can exploit this bug by following these steps:
1/ Using Lenstra elliptic curve factorization and the general number field sieve to factor $p$ completely, and
2/ Using Pohlig-Hellman and CRT to reduce DLP on the multiplicative group $Z_p$ to the multiplicative group $Z_{p'}$ where $p'$ is the largest factor of $\phi(p)$, and
3/ Using Pollar's rho or index calculus to solve DLP on $Z_{p'}$, and
4/ Sniffing socat traffic and recovering the DH shared secret, and
5/ Profit!

If this is a backdoor, it's trivial for its creators to exploit it, because they can skip 1/ and pre-compute most of 2/ and 3/. Even if this is not a backdoor, I suspect that it doesn't cost more than \$10K on AWS or Google Cloud Platform to perform step 1/, 2/ and 3/. If we're so lucky that F3889 is a product of 3 primes, 2 of which are 250-bit, step 1/ might cost less than \$2K, and pre-computation for step 2/ and step 3/ might cost even less.

Wednesday, January 27, 2016

As someone who cares about privacy, how do I feel working for Google?

(Needless to say, all opinions are mine, I don't speak for my employer or anyone else.)

A friend recently asked me this question

"Thai: I think you have showed a very consistent view on privacy so I would like to ask a rather sensitive question (feel free to not answer): how do you feel working for Google, one of the most notorious companies in tracking and keeping people private data?"

I found my job rewarding. Google in general and my team in particular have done a lot to improve the privacy and security of not only Google users, but the Internet as a whole.

If Google wasn't pushing hard and investing heavily in the past few years, most traffic on the Internet would have still been sent unencrypted or over outdated protocols. Many core developers of OpenSSL, the software package that enables Internet encryption, are my coworker. Many of the most important innovations and researches on Internet encryption were done by Googlers. HSTS, public-key pinning, certificate transparency, etc. you name it, we invented it. TLS 1.3 the latest version of the most important security protocol on the Internet is inspired by QUIC, an in-house protocol that we developed to make the Internet faster and more secure. We also found and fixed many weaknesses in earlier versions of TLS, and put the final nail in the coffin of SSL 3.0, an outdated and insecure protocol that had been used by many big websites.

Google employs top security researchers, and allows us to work on whatever we think would make the Internet a safer place. You won't find anywhere else a large group of people, 500 and still hiring, that care deeper and contribute greater to the security and privacy of the Internet. There is no other company at Google scale that employs a group of world class security researchers (some of them were snatched from NSA or GCHQ, because we want to deprive these agencies of security talents), and lets them do whatever it takes to kill 0-day vulnerabilities.

Have you ever heard of Neel Mehta? Probably not, but he's the guy that discovered Heartbleed, and sitting on the same floor as I am. Michal Zalewski, Tavis Ormandy, etc. many big names in security are working at Google. Michal's AFL literally revolutionizes overnight the practice of finding software vulnerabilities by fuzzing. These days if someone found some cool bug somewhere, chances are that person was using AFL. Tavis's dive into antivirus software has been a very fruitful endeavor, in which he has found numerous critical vulnerabilities that affected millions of Internet users. Totally we have found and fixed thousands of vulnerabilities in many popular software, including Apple's, Microsoft's, etc. If you happen to use a computer or a smart phone at home or at work, many Googlers have worked hard to help you not got hacked.

It'll be naive to conclude that Google has done all this only for altruism's sake. Google needs a safe Internet to conduct our business. People won't use the Internet or even computers if they can get hacked easily. Google's investment in killing software vulnerabilities, in making the Internet safer for everyone, therefore, is a win-win, thus, sustainable, investment.

Re tracking and keeping people private data: I guess by private data you meant web browsing habit? Google mines this data to display better and more relevant ads, which in turn power the free Internet as we see it today. Historically, most people have been not willing to pay for Internet services, thus ads-supported content has been the only sustainable business model that can work at Internet scale. Google pays billions of dollars to publishers (e.g., New York Times, Nguyen Ha Dong of Flabby Bird, etc.) every quarter. Without this source of revenue, the Internet wouldn't reach where it is today.

Google is not alone in this business, but perhaps we are the most successful, thus the most criticized. Most companies track and collect user data. Have you ever thought where the Data from Big Data comes from? It's people private data, mostly. While reading this article, a fellow Googler reminded me that mining private data is not only for displaying ads but also for improving products and in some cases creating new products. He wrote,

"Google Now and Google Photos are good examples. I love these products. How would I use the products without allowing Google to mine my private data? Talking about ads, there are people who don't like ads, but for people who are OK with ads like me, seeing relevant ads is much better than random ads. Leveraging private data is everywhere. You have the option of not providing private data by not using the products."

Google should do better than our competitors, as we always want to keep a high bar for ourselves. This is why we give users a lot of choices to opt out of the personalized ads system. This is why we give users tools to delete their accounts and to completely wipe out or export their data. This is why Google employs hundreds of people whose one and only job is to ensure that our products give users better privacy controls over their private data. Aforementioned Googler is working in Search, and he wrote,

"Google cares deeply about user privacy. I can tell you that all user private data are encrypted at Google. If you work in systems, you probably know that there are many tricks that do not work on encrypted data. In Search we are loosing XX% capacity because of encrypted data. There are many other burden in enforcing this policy in Google beside losing machine resource, e.g. it is hard to debug since normal developer cannot see log trace, etc. Another big company, Facebook, does not encrypt your private data in its data center (it is true 3 years ago, I don't know the current status). This is not to underestimate how much Facebook cares about privacy, but to show that how far Google is willing to go with protecting user information."

When I joined Google four years ago, privacy was neither something that I cared much nor a hot topic. Working at Google has actually made me care more and more about privacy. As the saying goes, with great power comes great responsibility, with all the data that you have entrusted Google with, my team and I take it as our duty to not let you down. Thanks to Snowden, privacy has become an international issue, in which Google has attracted a lot of criticisms, many of which are groundless. There are also a lot of players out there benefiting from spreading FUD about Google. What I want to allude to you in this little rant is that we care very much about the privacy and security of our users and the whole Internet, and we very much want to offer you and other users choices, be it opting out of personalized ads, deleting your personal data, or subscribing to ads-free services.

You can learn more about how we think about and protect your privacy at https://privacy.google.com/. You can manage your security and privacy settings at https://myaccount.google.com. The privacy and security check ups are great tools that shall allow you to protect your accounts and adjust important privacy settings to your preference. Ads is still our and the Internet's cash cow, but we also have many fast-growing non-ads businesses, in which we generate revenue by earning user trust. We've launched many ads-free subscription services such as YouTube Red, Google for Work (a.k.a, Google Apps for Your Domain), etc. Recently we introduced Contributor, I worked on this project, which allows users to support the websites they like and at the same time see fewer ads. In sum, if you are willing to pay for services Google provides many ads-free choices.

I'll leave Google one day, but I'll be a loyal user for life. Google has simply earned my trust. I trust that none, even Google employees or any three letter agencies, can unauthorizedly read my emails, view my photos, download my documents, etc. without having to step over the corpses of an army of very skilled engineers or lawyers, working very hard to protect whatever data that I've entrusted Google with. And you should too.