Read file path from utf-8 text file?

开发者 https://www.devze.com 2023-04-13 04:08 Source: the web
I have a UTF-8 text file, example.txt, that contains: c:/temp/file.txt

I read the file content using this method:

    public static String fileToString(final File file, final String charset) throws AppServerException
    {
        final byte[] buffer = new byte[(int) file.length()];
        FileInputStream fileInputStream = null;
        try
        {
            fileInputStream = new FileInputStream(file);
            fileInputStream.read(buffer);
        }
        catch (final FileNotFoundException e)
        {
            throw new AppServerException(e.getMessage());
        }
        catch (final IOException e)
        {
            throw new AppServerException(e.getMessage());
        }
        finally
        {
            FileHelper.close(fileInputStream);
        }

        try
        {
            return new String(buffer, charset);
        }
        catch (UnsupportedEncodingException e)
        {
            throw new AppServerException(e.getMessage());
        }
    }

Then I want to check if the file c:/temp/file.txt exists:

    String content = fileToString(new File("example.txt"), "UTF8");
    File file = new File(content);
    System.out.println(file.exists());

exists() returns false, but the file actually exists.

If I change the encoding of example.txt to ANSI using Notepad++, exists() returns true.

I already tried using: "c:\temp\file.txt", "c:\\temp\\file.txt", "c:\\\\temp\\\\file.txt", but without success.

I really need to use the file as UTF8. Do you have tips so the method exists() returns true?


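Notepad++ probably puts a byte order mark (BOM) at the start of the file. A BOM is unnecessary for UTF-8, and Java's UTF-8 decoder does not strip this three-byte sequence (EF BB BF). It ends up at the start of the decoded string as the invisible character U+FEFF, so it becomes part of the path and exists() returns false.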

Either use an editor that does not write a byte order mark, or save the file as ANSI if your filename contains only ASCII characters.
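
If the file has to stay UTF-8 and you cannot control the editor, a third option is to drop the BOM after decoding. The following is only a minimal sketch: the class name BomStripper and the helpers stripBom and toPath are made up for illustration and are not part of the asker's code. It also trims trailing whitespace, on the assumption that the file may end with a line break.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original code.
final class BomStripper {
    // The UTF-8 bytes EF BB BF decode to this single character.
    private static final char BOM = '\uFEFF';

    // Returns the text without a leading BOM, if there is one.
    static String stripBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    // Decodes UTF-8 bytes, drops a leading BOM, and trims
    // surrounding whitespace such as a trailing line break.
    static String toPath(byte[] utf8Bytes) {
        String decoded = new String(utf8Bytes, StandardCharsets.UTF_8);
        return stripBom(decoded).trim();
    }
}
```

With something like this in place, new File(BomStripper.stripBom(content).trim()) should refer to the intended path whether or not the editor wrote a BOM.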


Perhaps the file is not actually encoded as UTF-8. Can you print the actual byte values of the "\" characters in the file?
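
One way to check is to dump the raw bytes as hex. This is a rough sketch; HexDump and toHex are illustrative names, not existing helpers.

```java
// Hypothetical helper for inspecting the raw bytes of a file.
final class HexDump {
    // Formats each byte as two uppercase hex digits, separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative bytes do not print as FFFFFFxx.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }
}
```

For example, System.out.println(HexDump.toHex(buffer)) on the buffer read in fileToString shows what is really in the file. If the output starts with EF BB BF, the file begins with a UTF-8 byte order mark.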

While you are at it: InputStream.read(byte[] b) is not guaranteed to read b.length bytes from the stream. You should be reading in a loop and checking the return value of the read() method in order to see how many bytes were actually read in each call.
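
A minimal sketch of such a loop follows. The class name FullRead and the method readAll are made up for this example, and the 8192-byte chunk size is an arbitrary choice.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper showing a read loop that checks the return value.
final class FullRead {
    // Reads until end of stream, appending only the bytes actually read.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        // read() returns the number of bytes read, or -1 at end of stream.
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }
}
```

On Java 7 or later, java.nio.file.Files.readAllBytes(file.toPath()) does the same job for a file without a hand-written loop.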
