New Check: Utf8EncodingCheck #265
Comments
@romani commented on Jun 7: A byte order mark is not a requirement for UTF-8 files, see http://en.wikipedia.org/wiki/Byte-order_mark#UTF-8
We might need to port the "isutf8" application from C to Java; the sources are at https://joeyh.name/code/moreutils/ , file "isutf8.c". Attention: we cannot force the use of UTF-8 only! Plain ASCII is preferable and should be accepted, see my example above. We might need to use http://jchardet.sourceforge.net/ , which could give us full functional support for detecting most encodings (not only UTF-8).
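For illustration, here is a minimal sketch of what such a validity check could look like using only the JDK. It is neither a port of "isutf8.c" nor a use of jchardet; the class name `Utf8Sniffer` and its methods are hypothetical. It assumes the whole file fits in memory as a byte array. Pure ASCII input passes, since ASCII is a subset of UTF-8, and a BOM is reported separately rather than required.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Checkstyle.
final class Utf8Sniffer {
    private Utf8Sniffer() { }

    /** Returns true if the bytes are well-formed UTF-8; plain ASCII also qualifies. */
    static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Returns true if the bytes start with the optional UTF-8 byte order mark EF BB BF. */
    static boolean hasUtf8Bom(byte[] bytes) {
        return bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) {
        byte[] ascii = "class A {}".getBytes(StandardCharsets.US_ASCII);
        byte[] latin1 = {(byte) 0xE9}; // "e acute" in ISO-8859-1, malformed as UTF-8
        System.out.println(isValidUtf8(ascii));
        System.out.println(isValidUtf8(latin1));
        System.out.println(hasUtf8Bom(ascii));
    }
}
```

Note that a passing result only means the bytes are consistent with UTF-8, not that the author actually saved the file as UTF-8.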
@maxvetrenko commented on Aug 31: I read that InputStream uses the operating system's encoding. All libraries read bytes from an InputStream, so all bytes are already encoded in the operating system's encoding.
Here's my investigation of encoding detection by:
I used files in different encodings, each with corresponding content, as input on Linux (Fedora). The output may differ on Windows. We can't say for sure what a file's encoding is. That is not a task for Checkstyle.
Won't fix
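A small sketch of why the encoding cannot be determined for sure from the bytes alone: the same two bytes are legal in both UTF-8 and ISO-8859-1 but decode to different text. The class name `EncodingAmbiguityDemo` is hypothetical and the byte values are chosen only for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Checkstyle.
final class EncodingAmbiguityDemo {
    private EncodingAmbiguityDemo() { }

    /** Decodes the given bytes with the given charset. */
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is one character (U+00E9) in UTF-8,
        // but two characters (U+00C3, U+00A9) in ISO-8859-1.
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9};
        System.out.println(decode(bytes, StandardCharsets.UTF_8).length());
        System.out.println(decode(bytes, StandardCharsets.ISO_8859_1).length());
    }
}
```

Both decodings succeed without error, so a detector can only guess which one the author intended.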
Source files have to be UTF-8 encoded: http://google-styleguide.googlecode.com/svn/trunk/javaguide.html#s2.2-file-encoding