How to extract information from a table with variable length and/or more than one page?

Q: How do I extract information from a table with variable length and/or more than one page?

A: This task comes up often, e.g. when you extract data from a database (search engines, flight schedules, news services). Depending on your keyword, you might get zero, one, two or 500 entries back. To extract this data, you need to divide your macro into two separate macros, as described here. Your second macro consists of only one extraction tag for one line of the table, e.g.:

EXTRACT POS={{mypos}} TYPE=TXT ATTR=<TD<SP>noWrap>*EUR* 

After replaying the first macro, which navigates to the results, you loop through them by changing the POS parameter, i.e. by stepping the mypos variable from 1 to 999. The EXTRACT line above is the content of the "macro_extract" macro in the source code sample below. It extracts the EUR price on each line.

Once you get the #EANF# message (Extraction Anchor Not Found) back instead of a result, you know that you have reached the end of the table on the current page.
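
The end-of-table test can be wrapped in a small helper. The following is only a minimal sketch and not part of iMacros: IsEndOfTable is a hypothetical name, and the code assumes, as the sample further down does, that iimPlay returns 1 on success and that multiple extraction results are separated by "[EXTRACT]".

```vb
'Hypothetical helper (not an iMacros function).
'Returns True when the result of the last "macro_extract" run indicates
'that the requested table row does not exist on the current page.
Function IsEndOfTable(ByVal iRet As Long, ByVal sData As String) As Boolean
    Dim sParts() As String

    If iRet <> 1 Or Len(sData) = 0 Then
        'Macro failed or nothing was extracted
        IsEndOfTable = True
        Exit Function
    End If

    'Only the first extracted field is checked for the #EANF# marker
    sParts = Split(sData, "[EXTRACT]")
    IsEndOfTable = (InStr(sParts(0), "#EANF#") > 0)
End Function
```

Inside the row loop, the check would then read: If IsEndOfTable(iRet, sData) Then Exit For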

If necessary, you can now run another macro to click on a "NEXT" link for another page of results. If no "NEXT" link is found, you know that you are done.



Here is a Visual Basic source code snippet that shows how the different macros are used:

'Read keywords
For i1 = 1 To 999 'Loop over the keywords until all are processed (=> Read_Line returns "ERROR")

    sKeyword = Read_Line(mFileInput, i1) 'Read keyword from a text file
    If sKeyword = "ERROR" Or Len(sKeyword) < 2 Then
        Exit For
    End If
    iRet = iim1.iimDisplay("Search: " + sKeyword)
    iRet = iim1.iimSet("-var_search", sKeyword)
    iRet = iim1.iimPlay("macro_search")
    sData = iim1.iimGetLastExtract()

    iRet = iim1.iimDisplay("Extract: " + sKeyword)
    
    For i2 = 1 To 999 'Loop over the NEXT links until an error occurs (=> no more NEXT links)

        For i3 = 1 To 999 'Loop over the table rows until an error occurs (=> no more data on this page)

            DoEvents
            iRet = iim1.iimSet("-var_mypos", CStr(i3))
'           iRet = iim1.iimDisplay(sKeyword + "P:" + CStr(i2) + "L:" + CStr(i3))
            iRet = iim1.iimPlay("macro_extract")
            sData = iim1.iimGetLastExtract()
            If iRet = 1 And Len(sData) > 0 Then
                'Data found, now split it and save it to a file
                sSplit = Split(sData, "[EXTRACT]")
                s0 = sSplit(0)
                s1 = "" 'Reset so that values from the previous row are not reused
                s2 = ""
                If UBound(sSplit) > 0 Then s1 = sSplit(1)
                If UBound(sSplit) > 1 Then s2 = sSplit(2)
                i = InStr(s0, "#EANF#")
                If i <= 0 Then
                    sLine = sKeyword + sSep + CStr(i2) + sSep + CStr(i3) + sSep + s0 + sSep + s1 + sSep + s2
                    Call Write_Line(mFileOutput, sLine)
                Else
                    Exit For 'next page
                End If
            Else
                Exit For 'next page
            End If 'iRet
        Next  'table rows loop
        
        iRet = iim1.iimDisplay(sKeyword + " Page: " + CStr(i2))
        iRet = iim1.iimPlay("macro_next")
        If iRet < 0 Then
            Exit For 'next keyword
        End If
        
    Next 'NEXT links loop


 Next 'Keyword list loop


iRet = iim1.iimDisplay("Extraction completed")



Note: The support functions used in this example are:

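'Returns entry number iline (1-based) read from the text file sFile, or "ERROR" if the file has fewer entries.
'Note that Input # treats commas as separators, so use one keyword without commas per line.
Function Read_Line(sFile, iline) As String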
Dim sValue
Dim i As Integer
Dim bFound As Boolean
bFound = False

Open sFile For Input As #1
i = 1
Do While (Not EOF(1) And bFound = False)
    Input #1, sValue
    If i = iline Then bFound = True
    i = i + 1
Loop
Close #1

If bFound = False Then
    sValue = "ERROR"
End If

Read_Line = sValue
End Function

'Appends sLine as a new line to the text file sFile.
Public Sub Write_Line(sFile, sLine)
   Open sFile For Append As #2
   Print #2, sLine
   Close #2
End Sub




Page URL http://www.iopus.com/imacros/help/faq_extract_pages.htm