
Problem with extracting values from xml file using java and regex

Developer https://www.devze.com 2023-03-30 05:38 Source: web

I have a file with the following contents

<div name="hello"></div>

and I need Java code that reads this file and prints only the word hello.

This is what I have come up with:

while ((line = bf.readLine()) != null) {
    linecount++;

    int indexfound = line.indexOf("<div name");

    if (indexfound > -1) {
        Pattern p = Pattern.compile("\"([^\"]*)\"");
        Matcher m = p.matcher(line);
        while (m.find()) {
            System.out.println(m.group(1));
        }
    }
}

bf.close();
}} catch (IOException e) {
        e.printStackTrace();
}}}

The problem with this code is that if I change the file so that it looks like this

<div name="hello" value="hi"></div>

then hi also gets printed, but I want only hello to be printed.


While the best answer to questions like this is to advocate the use of an HTML or XML parser to extract attributes, it's worthwhile pointing out the issue in your question.

You are getting both attribute values printed because your pattern matches any text surrounded by double quotes, and the while loop prints every such match on the line.

Since you only want the value of the name attribute, the pattern should include the attribute name:

Pattern.compile("name=\"([^\"]*)\"");
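To see the difference on the two-attribute line from the question, here is a minimal sketch built around that pattern. The class name NameAttributeRegex and the method extractName are made up for illustration, and the leading \b is my own addition (an assumption, not part of the pattern above) so that an attribute such as surname= is not picked up by accident.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NameAttributeRegex {
    // Captures only a quoted value that directly follows name=.
    // The \b is an added assumption: it stops "surname=" from matching.
    private static final Pattern NAME_ATTR = Pattern.compile("\\bname=\"([^\"]*)\"");

    /** Returns the value of the first name attribute found on the line, if any. */
    public static Optional<String> extractName(String line) {
        Matcher m = NAME_ATTR.matcher(line);
        // find() is called once, so at most one value is returned per line
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "<div name=\"hello\" value=\"hi\"></div>";
        extractName(line).ifPresent(System.out::println);
    }
}
```

Inside the original read loop, the same call could replace the inner while loop, for example extractName(line).ifPresent(System.out::println). This still assumes the attribute uses double quotes and has no spaces around the equals sign.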


You can use any of the DOM libraries available in Java, such as JDOM or dom4j. The file you are trying to parse is an XML (HTML) file, and these libraries were developed to parse exactly such files. It's easy to get started; follow the tutorial at http://www.java-samples.com/showtutorial.php?tutorialid=152


Your code might work for the change you have made in the XML, but you may need to change it again with every other change to the XML. This can be exhausting, so I suggest that the best way to read an XML document in Java is to use a parser. There are two kinds of parser I have come across recently: DOM and SAX. You should find a lot of tutorials and examples on the internet; these are where I learned a lot: http://www.mkyong.com/java/how-to-read-xml-file-in-java-sax-parser/ and http://www.mkyong.com/java/how-to-read-xml-file-in-java-dom-parser/
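As a rough illustration of the DOM route, here is a small sketch that uses only the parser bundled with the JDK (javax.xml.parsers). The class name DivNameDom and the method firstDivName are hypothetical, and it assumes the input is well-formed XML with a single root element, as in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class DivNameDom {
    /** Returns the name attribute of the first div element, or "" if there is none. */
    public static String firstDivName(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList divs = doc.getElementsByTagName("div");
            if (divs.getLength() == 0) {
                return "";
            }
            Element div = (Element) divs.item(0);
            // getAttribute returns an empty string when the attribute is absent
            return div.getAttribute("name");
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstDivName("<div name=\"hello\" value=\"hi\"></div>"));
    }
}
```

Because the attribute is looked up by name, adding value="hi" or reordering the attributes does not change the result. To read from disk instead of a string, DocumentBuilder also has a parse(File) overload; the file path would be whatever your input file is called.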
