Pulling data from social sites
I went through the HPCC Introduction (white paper PDF) and the Sentilyze use case, which takes a CSV file as input.
The Introduction PDF claims that HPCC can pull data from external web sites.
How can this be done for social sites such as Facebook and Twitter?
Also, how do I specify the record structure for such data, and will HPCC's NLP support be required?
Thanks and regards!
- kaliyugantagonist
- Posts: 43
- Joined: Mon Jul 23, 2012 11:23 am
One of our developers has used the Twitter API (https://dev.twitter.com/docs/api) to collect tweets. She wrote three apps:
1/ a Linux app to harvest tweets
2/ a JavaScript app to do selective tweet gets
3/ a Linux app, callable from ECL using PIPE, to do selective tweet gets
She basically used two different approaches, but since the Twitter API has been evolving, I am not sure whether both are still available or supported:
1/ Calling the API repeatedly to get all the tweets. We needed that to create a reasonable training set for our ML classifier.
2/ Calling the API a few times, passing in a specific term to filter the tweets by. We needed that to get a set of tweets associated with a specific topic.
In both cases the app has to keep calling the Twitter API, so the code should be written so that it does not upset the Twitter service, for example by staying within its rate limits.
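To make the "keep calling without upsetting the service" point concrete, here is a minimal Python sketch of a polite polling loop. It is not the code our developer wrote. The names harvest, RateLimited and fetch_page are hypothetical: fetch_page stands in for whatever function actually calls the API, and it is assumed to return a list of records plus a cursor for the next page, and to raise RateLimited when the service asks the client to slow down.

```python
import time
from typing import Callable, Iterator, List, Optional, Tuple


class RateLimited(Exception):
    """Hypothetical error raised by the fetcher when the service says 'slow down'."""

    def __init__(self, retry_after: float = 0.0):
        super().__init__("rate limited")
        self.retry_after = retry_after


# A fetcher takes a cursor (None for the first page) and returns
# (records, next_cursor); next_cursor is None when nothing is left.
FetchPage = Callable[[Optional[str]], Tuple[List[dict], Optional[str]]]


def harvest(
    fetch_page: FetchPage,
    min_interval: float = 1.0,
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield records page by page, pausing between calls and backing off when limited."""
    cursor: Optional[str] = None
    backoff = min_interval
    retries = 0
    while True:
        try:
            records, cursor = fetch_page(cursor)
        except RateLimited as exc:
            retries += 1
            if retries > max_retries:
                raise
            # Wait at least as long as the service asked, otherwise double each time.
            sleep(max(exc.retry_after, backoff))
            backoff *= 2
            continue
        # A successful call resets the backoff state.
        retries = 0
        backoff = min_interval
        yield from records
        if cursor is None:
            return
        # Fixed pause between successful calls to stay under the limit.
        sleep(min_interval)
```

The sleep function is passed in so the loop can be exercised without real waiting. In a real harvester, fetch_page would wrap the HTTP call and translate the service's rate-limit response into RateLimited.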
Regards,
Bob
- bforeman
- Community Advisory Board Member
- Posts: 1006
- Joined: Wed Jun 29, 2011 7:13 pm
Hi Bob,
Thanks for the reply.
I'm totally clueless about ECL PIPE, which is only briefly mentioned in the HPCC Introduction documentation. Please see my question below.
http://hpccsystems.com/bb/viewtopic.php?f=10&t=722&sid=22459bc0c057c631e3d7fc685ffe6fa3
Where can I find documentation and examples for ECL PIPE?
Thanks and regards!
- kaliyugantagonist
- Posts: 43
- Joined: Mon Jul 23, 2012 11:23 am
The PIPE function is documented in the ECL Language Reference manual.
You can also type the word "PIPE" in any ECL file in the ECL IDE and press the F1 key.
I will have a look at your other post.
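As a rough illustration of the external side of a PIPE call, here is a small Python sketch of a command-line program that writes one CSV record per line to standard output, which is the kind of output ECL can read through PIPE when given a matching record layout and the CSV option (check the Language Reference for the exact syntax). Everything specific here is an assumption made for the example: the script name tweet_filter.py, the input file of one JSON tweet per line, and the id, created_at and text fields.

```python
import csv
import json
import sys
from typing import Iterable, Iterator, List, TextIO


def matching_rows(lines: Iterable[str], term: str) -> Iterator[List[str]]:
    """Yield [id, created_at, text] for each JSON tweet whose text contains term."""
    needle = term.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        tweet = json.loads(line)
        # Collapse newlines and repeated spaces so each record stays on one line.
        text = " ".join(str(tweet.get("text", "")).split())
        if needle in text.lower():
            yield [str(tweet.get("id", "")), str(tweet.get("created_at", "")), text]


def write_csv(rows: Iterable[List[str]], out: TextIO) -> int:
    """Write rows as CSV to out and return how many were written."""
    writer = csv.writer(out, lineterminator="\n")
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Runs only when invoked as: tweet_filter.py <tweets.jsonl> <term>
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], encoding="utf-8") as source:
        write_csv(matching_rows(source, sys.argv[2]), sys.stdout)
```

On the ECL side, the record layout would then need three string fields in the same order as the CSV columns. The csv module quotes any text that contains commas or quotes, so the field boundaries stay unambiguous.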
Regards,
Bob
- bforeman
- Community Advisory Board Member
- Posts: 1006
- Joined: Wed Jun 29, 2011 7:13 pm