Problem description
VB.NET: how to make a StreamReader ignore some lines?
I am using a StreamReader to get the HTML of a page, but there are lines that I want to ignore, for example any line that starts with <span>.
Any advice? Here is my function:
' Requires Imports System.IO, System.Net and System.Text at the top of the file
Public Function GetPageHTMLReaderNoPrx(ByVal address As Uri) As StreamReader
    Dim request As HttpWebRequest
    Dim response As HttpWebResponse = Nothing
    Dim reader As StreamReader = Nothing ' stays Nothing if the request fails
    Try
        request = DirectCast(WebRequest.Create(address), HttpWebRequest)
        response = DirectCast(request.GetResponse(), HttpWebResponse)
        Select Case response.StatusCode
            Case HttpStatusCode.OK
                reader = New StreamReader(response.GetResponseStream(), Encoding.Default)
            Case Else
                MsgBox(response.StatusCode)
        End Select
    Catch
        ' On any failure, release the response; the caller receives Nothing
        If response IsNot Nothing Then response.Close()
    End Try
    Return reader
End Function
This is what the HTML looks like:
<tr>Text
<span>show all</span>
</tr>
‑‑‑‑‑
Reference solutions
Method 1:
If you insist on using strings, you could do something like this:
<pre class="lang‑vb prettyprint‑override">Do
    Dim line As String = reader.ReadLine()
    If line Is Nothing Then Exit Do 'end of stream
    If line.StartsWith("<span>") Then Continue Do 'ignore this line
    'otherwise do some processing here
    '...
Loop
</pre>
But this approach is fragile: any minor change in the input HTML (extra indentation, an attribute on the tag, different casing) can break your flow.
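If you stay with the string approach anyway, the check can be made a little more tolerant. The following is only a sketch under my own assumptions: the module name SpanLineFilter and the helper ReadLinesWithoutSpan are made up for illustration, and it still treats HTML as plain text, so it skips any line whose first tag name merely begins with "span".

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module SpanLineFilter
    ' Hypothetical helper (not from the original post): returns every line
    ' from the reader except those whose first non-blank text is "<span".
    Function ReadLinesWithoutSpan(ByVal reader As TextReader) As List(Of String)
        Dim kept As New List(Of String)
        Do
            Dim line As String = reader.ReadLine()
            If line Is Nothing Then Exit Do ' end of stream
            ' Ignore indentation and casing, so "  <SPAN class=...>" is skipped too
            If line.TrimStart().StartsWith("<span", StringComparison.OrdinalIgnoreCase) Then Continue Do
            kept.Add(line)
        Loop
        Return kept
    End Function

    Sub Main()
        ' Sample input modelled on the HTML in the question
        Dim html As String = "<tr>Text" & Environment.NewLine &
                             "  <span class=""x"">show all</span>" & Environment.NewLine &
                             "</tr>"
        Using reader As New StringReader(html)
            For Each keptLine As String In ReadLinesWithoutSpan(reader)
                Console.WriteLine(keptLine)
            Next
        End Using
    End Sub
End Module
```

Because the helper takes a TextReader, the StreamReader returned by GetPageHTMLReaderNoPrx could be passed in directly, provided it is not Nothing.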
A more elegant solution would be to use XElement:
Dim xml = <tr>Text
<span>show all</span>
</tr>
xml.<span>.Remove()
MsgBox(xml.Value.Trim)
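The XML literal above is hard-coded. As a rough sketch of how the same idea might look when the markup arrives as a string (for example from reader.ReadToEnd()), you could parse it first. The module and function names here are hypothetical, and this assumes the fragment is well-formed XML, which real-world HTML often is not; XElement.Parse throws on malformed input.

```vbnet
Imports System
Imports System.Xml.Linq

Module SpanRemovalSketch
    ' Hypothetical helper: parses a well-formed fragment and returns its
    ' text content with all direct <span> children removed.
    Function TextWithoutSpans(ByVal fragment As String) As String
        Dim xml As XElement = XElement.Parse(fragment) ' throws on malformed markup
        xml.Elements("span").Remove()                  ' drop the direct <span> children
        Return xml.Value.Trim()
    End Function

    Sub Main()
        ' Same fragment as in the question, passed as a string
        Console.WriteLine(TextWithoutSpans("<tr>Text<span>show all</span></tr>"))
    End Sub
End Module
```

For pages that are not valid XML, a dedicated HTML parser is the safer route, since XElement.Parse rejects unclosed tags and unquoted attributes.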
(by user1570048, Victor Zakharov)