Wednesday, December 25, 2013

Baidu IME, Simeji (not really) sending keystrokes to outside servers

UPDATE: Much ado about nothing

The NHK piece I watched this morning turns out to have been total crap and essentially a staged sending of a password. My apologies for being duped. I should have seen through the bullshit, and I'll explain why below.

But first, the security company that was featured has posted a clarification on their blog. Both the Baidu IME and Simeji are doing cloud conversion of Japanese text. That is, conversion of 2-byte hiragana(全角文字)to kanji. To do this, they send all 2-byte text to the cloud, and the security company claims the text is sent even when the option is turned off. So, yes, this would seem to be a bug. If the cloud option is off, nothing should be done in the cloud.

However, neither IME sends standard (single-byte) text at all.

Credit card numbers and passwords are always in single-byte text, which means that neither the Baidu IME nor Simeji would have sent them, and the clarification explains just that:
Baidu IME , Simejiでは、全角入力の場合のみ情報が送信されています。クラウド入力Offの場合でも入力文字列を送信していました。パスワードなど半角入力のみの場合は送信されていません。クレジット番号や電話番号も変換しなければ送られません。
The Baidu IME and Simeji only send information when text is entered as two byte text. This happens even when cloud input is turned off. Passwords that are in single byte text are not sent. Credit card and phone numbers are also not sent if they do not require conversion. [Emphasis is original]
What this last sentence says is that if you enter the numbers in two-byte text, e.g., 1234 typed as full-width １２３４, then they will be sent since they are a conversion candidate.

Here's the thing: no one ever enters passwords or credit card numbers as two-byte text, so the cases in which they would have been sent are essentially zero. You cannot enter a credit card number as two-byte text on any e-commerce site (partially for this reason).
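To make the single-byte versus two-byte distinction concrete, here is a minimal Python sketch. It is only an illustration of how full-width (全角) input can be told apart from half-width (半角) input; it is not how the Baidu IME or Simeji decide what to send, which I have not seen. The function names are my own. Note that "two-byte" is the legacy-encoding way of saying full-width; in UTF-8 these characters actually take three bytes.

```python
import unicodedata


def contains_fullwidth(text: str) -> bool:
    """Return True if any character is full-width or wide.

    East Asian Width 'F' covers full-width forms such as １ and Ａ;
    'W' covers wide characters such as hiragana, katakana, and kanji.
    """
    return any(unicodedata.east_asian_width(ch) in ("F", "W") for ch in text)


def to_halfwidth(text: str) -> str:
    """Fold full-width ASCII variants (１２３４) to their plain forms (1234)."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    for sample in ["1234", "１２３４", "１２３４はパスワードです"]:
        print(repr(sample), contains_fullwidth(sample))
```

A password typed the normal way never trips the check; only text typed in Japanese input mode, like the phrase used in the broadcast, does.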

The staged password theft

Getting to how all this got started, they used a phrase in Japanese that was essentially "1234 is a password," and it was done as 1234はパスワードです。(Or something like that.) The camera then zooms in on a computer monitor that is capturing and displaying the Baidu IME's communication with the cloud server, and they show 1234 being sent. At the time, I was thinking, "who uses 2-byte text for passwords?"

And the answer is:

A security company being broadcast on national TV uses 2-byte text for a password when that is the only way to trigger the reaction they want, even when it's a totally impossible situation. The whole thing was staged. NHK is usually much better than this.

Original post

According to NHK  [J], the Baidu IME for PCs is sending all keystrokes, plus application and computer information, to outside servers even when the settings are explicitly set to not send information.

It's less clear what is happening with the Simeji Android IME. I haven't used it in years, since adamrocker sold it to Baidu. With Simeji, it could just be that it is set to send and receive data by default, as opposed to always sending data regardless of preference settings. Either way, I recommend avoiding it in favor of the Google Japanese IME (insert NSA joke here).

This is very different from other IMEs

Of course all IMEs have the ability to send data back home. This allows for new words to be added as they become commonly used and for general improvements to input*. The differences here are major. According to the NHK article, both Google and Just Systems (maker of ATOK) send anonymized usage statistics with explicit permission from the user. That is, sending information is opt-in.

Baidu on the other hand does the exact opposite. Data are sent by default. Data are not anonymized. Raw text input is sent. You cannot opt out. If you do opt out, data are sent anyway.


* I'd argue that Swype, while I initially praised its Japanese input, does not actively collect any information about how Japanese is input. None of these suggestions or bugs has been implemented or fixed. I feel like I basically had to teach Japanese grammar to the Swype keyboard, but with any complex sentence structure, forget about swyping in Japanese.

Saturday, December 21, 2013

Japan Communications to bring landline numbers to Japanese mobile phones


Frank Sanda, general telecom disrupter and founder and CEO of Japan Communications Inc., the provider of b-mobile products, recently announced on Twitter that they are "about bring Tokyo landline numbers to mobile phones". That is, you could have a 03-xxxx-xxxx phone number attached to your smartphone. This will make your mobile number indistinguishable at a glance from a normal landline number and will allow people to call you for a fraction of the cost they would incur if they called your mobile number.

He also said that the goal for JCI this next year is to move in the same revenue class (trillion yen) as NTT East and NTT West, and he thinks this "03 Smartphone" will do just that. And he also thinks the NTTs are going to need some help from someone, somewhere.

As far as I can see, the way this is going to happen is using VoIP ("IP softphone") with a "Class A" LTE connection. If so, not only will calling in be cheaper, so will calling out. Docomo charges ¥42 per minute to place an out-of-network call from an LTE phone. Doing the same with VoIP costs one tenth of that price, and if the connection is good (I'll come back to that later), the voice quality is loads better due to the use of higher-quality codecs.
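As a rough illustration of what that difference means on a bill, here is a small Python sketch. The ¥42 rate and the "one tenth" figure come from the paragraph above; simple per-minute billing with no rounding or bundled minutes is my own simplifying assumption, and the names are made up for the example.

```python
from decimal import Decimal

# Rate quoted above for an out-of-network call from a Docomo LTE phone.
DOCOMO_LTE_YEN_PER_MIN = Decimal("42")
# "One tenth of that price" for the same call over VoIP.
VOIP_YEN_PER_MIN = DOCOMO_LTE_YEN_PER_MIN / 10


def call_cost(minutes: int, yen_per_min: Decimal) -> Decimal:
    """Cost in yen, assuming plain per-minute billing (an assumption)."""
    return yen_per_min * minutes


if __name__ == "__main__":
    minutes = 30
    print("Docomo LTE:", call_cost(minutes, DOCOMO_LTE_YEN_PER_MIN), "yen")
    print("VoIP:      ", call_cost(minutes, VOIP_YEN_PER_MIN), "yen")
```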

A bit of background:

Phone Numbers in Japan

In Japan, prefixes are typically reserved for particular types of devices. Mobile phones are allocated numbers beginning with 090, 080, and (recently) 070. Because incoming calls are charged to the caller, calling a mobile phone can be quite expensive. A commonly available option is an IP (VoIP) softphone app for your smartphone that is attached to a 050 number. However, incoming calls to a 050 number can still incur a premium charge, above the cost of a standard local call. So, the best would be to have a normal number attached to your cellphone, for example a “03” Tokyo prefix.
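The prefix scheme described above can be sketched as a simple lookup. This covers only the prefixes mentioned in this post (it is nowhere near the full Japanese numbering plan), and the function name and labels are mine.

```python
# Only the prefixes discussed in this post; anything else is "unknown".
PREFIX_TYPES = {
    "090": "mobile",
    "080": "mobile",
    "070": "mobile",
    "050": "IP phone (VoIP)",
    "03": "Tokyo landline",
}


def classify_number(number: str) -> str:
    """Classify a Japanese phone number by its leading digits."""
    digits = "".join(ch for ch in number if ch.isdigit())
    # Check longer prefixes first so a short prefix never shadows a long one.
    for prefix in sorted(PREFIX_TYPES, key=len, reverse=True):
        if digits.startswith(prefix):
            return PREFIX_TYPES[prefix]
    return "unknown"


if __name__ == "__main__":
    for n in ["090-1234-5678", "050-1234-5678", "03-1234-5678"]:
        print(n, "->", classify_number(n))
```

The point is that a caller (or their carrier's billing system) can tell from the first few digits what kind of line they are calling, which is exactly why a 03 number on a smartphone is attractive.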

Using a SIP client with a purchased 03 DID allows this right now, which is similar to what many of us do to get a phone number from “back home”. (I used to use callwithus, but now have a free DID from callcentric at which I point my Google Voice number.) While this is easy and straightforward in most countries, it is a bit more complicated here in Japan (though not impossible).

Quality of Service

Part of the reason why you can’t just get a prepackaged VoIP plan for your mobile phone with a 03 number is that there are minimum requirements before particular prefixes may be allocated. For a standard landline (固定電話) number to be allocated to a VoIP provider, the line must be Class A, capable of an R-factor greater than 80 for 95% of the time with less than 100 ms latency (according to Wikipedia). A Class B line (R-factor >70 and latency <150 ms) may only be allocated a 050 number.
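Here is a minimal Python sketch of those thresholds, using the numbers quoted above. How the regulator actually measures a line is not something I know; in particular, treating the "95% of the time" rule as a share of measurement samples that meet both the R-factor and latency limits is my own assumption, as are the function names.

```python
from typing import Iterable, Tuple


def line_class(r_factor: float, latency_ms: float) -> str:
    """Classify one measurement using the thresholds quoted in the post."""
    if r_factor > 80 and latency_ms < 100:
        return "A"  # eligible for a standard landline number
    if r_factor > 70 and latency_ms < 150:
        return "B"  # eligible for a 050 number only
    return "below B"


def meets_class_a(samples: Iterable[Tuple[float, float]],
                  required_percent: int = 95) -> bool:
    """True if at least required_percent of samples are Class A.

    Each sample is (r_factor, latency_ms). Counting samples as a
    stand-in for "95% of the time" is an assumption.
    """
    samples = list(samples)
    if not samples:
        return False
    good = sum(1 for r, lat in samples if line_class(r, lat) == "A")
    # Integer comparison avoids floating-point edge cases at exactly 95%.
    return good * 100 >= required_percent * len(samples)


if __name__ == "__main__":
    off_peak = [(88, 60)] * 19 + [(75, 120)]
    peak = [(88, 60)] * 15 + [(65, 180)] * 5
    print("off-peak qualifies:", meets_class_a(off_peak))
    print("peak qualifies:    ", meets_class_a(peak))
```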

A Caveat

LTE is technically Class A and could qualify for a standard landline number allocation.

Technically.

Realistically, I’m a bit skeptical how well this will work out. At peak times, the carriers are beyond capacity. I think the 95% of the time R factor requirement might be hard to meet. NTT East and West, as well as NTT Docomo and NTT Communications - hell, all the NTTs - are going to fight this. I’m sure all but Docomo will use the same argument that LTE doesn’t yet qualify as Class A. On top of this, you have inconsistent behavior with VoIP apps across different smartphones, as well as potential concerns with battery.

Convert existing b-mobile voice plans to the “Free Data” plan

Japan Communications has announced a potentially free upgrade path to the “Free Data” plan for users of their older voice SIMs. You can also keep your existing phone number. (Though I don’t think that many readers here will be interested, the upgrade also includes their new “keitai denwa” SIM, which is a voice+SMS only product for feature phones.)

You can apply for the change in plan from noon on December 27th via the My B-mobile page under “change service” (サービス変更).

However, the cost of upgrading could be significant for some people:
  • A ¥2,000 fee is charged to change between a 3G and an LTE plan (or vice versa in the case of the “keitai denwa” SIM).
  • Older SIM cards are incompatible, and docomo levies a ¥3,000 fee for a new SIM.
When changing plans, you’ll need to enter the number on your SIM card. The older white and blue SIM cards with a product number beginning with DN03 or AX03 (i.e., version 3) will have to be changed out.


The SIM with "docomo" written in red on the right is an LTE-capable SIM card. These SIMs are used for newer voice plans, even if the data is only 3G. The SIM with FOMA written in blue on the left is only capable of 3G and must be exchanged.

If you have the FOMA SIM shown on the left, then you will also have a 3G plan, so you're looking at ¥5,000 to upgrade, which is more than just dumping your old plan and starting fresh with a new Free Data plan. The question becomes: is keeping your existing phone number worth ¥2,000?
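The fee logic above boils down to two independent charges, which can be sketched in Python as follows. The fee amounts and the DN03/AX03 prefixes are from this post; the function names and the sample product numbers are hypothetical, and I am assuming no other fees apply.

```python
PLAN_CHANGE_FEE = 2000  # yen, for switching between a 3G and an LTE plan
NEW_SIM_FEE = 3000      # yen, levied by docomo for a replacement SIM
OLD_SIM_PREFIXES = ("DN03", "AX03")  # version 3 SIMs that must be swapped


def needs_new_sim(product_number: str) -> bool:
    """True if the SIM's product number marks it as an old version 3 card."""
    return product_number.strip().upper().startswith(OLD_SIM_PREFIXES)


def upgrade_cost(switches_3g_lte: bool, product_number: str) -> int:
    """Total upgrade cost in yen, assuming only the two fees listed above."""
    cost = 0
    if switches_3g_lte:
        cost += PLAN_CHANGE_FEE
    if needs_new_sim(product_number):
        cost += NEW_SIM_FEE
    return cost


if __name__ == "__main__":
    # Product numbers below are made up for illustration.
    print(upgrade_cost(True, "DN03-EXAMPLE"))   # old FOMA SIM on a 3G plan
    print(upgrade_cost(False, "DN05-EXAMPLE"))  # newer SIM, same network type
```

If neither charge applies, the upgrade really is free, which is the "potentially free" part of the announcement.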