Guest post by Kathrina Molera
Past Associate at AMIHAN

A few weeks back, our Big Data architect, Ace Subido, and I went on a tech immersion in California to attend Hadoop Summit 2016. The community is celebrating 10 years of Hadoop contributions this year. 10 years! Yes, that’s how long Hadoop Big Data technology has been maturing, and from what we’ve seen, it will surely continue to develop and improve in the coming years. The event was attended by the biggest players in the industry, including Yahoo, Google, Hortonworks, Microsoft, and Cisco, to name just a few. A number of promising startups were there as well. We were one of only two companies from the Philippines that attended the event, the other being PLDT’s Talas Data Intelligence, Inc. It was a proud moment for us to represent the country and to show our peers that we’ve been doing Big Data and Hadoop innovations for some of the largest enterprises in the Philippines, and doing our best to evangelize Hadoop on our side of the world.

Learning about Big Data from the best companies in the world

The three-day event had a packed agenda consisting of keynotes, breakout sessions, and the community showcase. The keynote sessions included impressive demos of the latest innovations in the field, such as how Progressive Insurance Company used Hadoop and analytics to bring down insurance costs for their customers through user segmentation. They used a sensor device to track the driving behavior of their customers and determine whether each one is a safe or an unsafe driver. By running analytics on the data collected, Progressive was able to adjust premiums per individual, saving their customers a total of $523M.
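To make the idea of segmenting drivers more concrete, here is a minimal sketch in Python of how sensor readings might be turned into a driver segment and a premium adjustment. Everything in it is an assumption for illustration only: the field names, the thresholds, and the multipliers are hypothetical and are not Progressive’s actual model, which was not described in that level of detail.

```python
from dataclasses import dataclass


@dataclass
class TripSummary:
    """Hypothetical per-driver aggregate of sensor readings."""
    driver_id: str
    miles_driven: float
    hard_brakes: int      # count of sudden braking events
    night_miles: float    # miles driven late at night


def risk_segment(trip: TripSummary) -> str:
    """Assign a driver to a segment using made-up thresholds."""
    if trip.miles_driven <= 0:
        return "unknown"
    brakes_per_100_miles = 100 * trip.hard_brakes / trip.miles_driven
    night_share = trip.night_miles / trip.miles_driven
    if brakes_per_100_miles < 1.0 and night_share < 0.10:
        return "safe"
    if brakes_per_100_miles < 3.0 and night_share < 0.30:
        return "average"
    return "risky"


# Hypothetical multipliers applied to the base premium for each segment.
PREMIUM_MULTIPLIER = {
    "safe": 0.85,
    "average": 1.00,
    "risky": 1.10,
    "unknown": 1.00,
}


def adjusted_premium(base_premium: float, trip: TripSummary) -> float:
    """Scale the base premium by the multiplier of the driver's segment."""
    return round(base_premium * PREMIUM_MULTIPLIER[risk_segment(trip)], 2)


if __name__ == "__main__":
    careful = TripSummary("d-001", miles_driven=800, hard_brakes=4, night_miles=20)
    print(risk_segment(careful), adjusted_premium(1000.0, careful))
```

In a real deployment the thresholds would come from analytics over the whole customer base rather than being hard-coded, which is where a platform like Hadoop comes in.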

Another noteworthy talk was on Google’s answer to the need for an on-demand Big Data and analytics platform. If you’re a company that wants to see if Big Data is right for you, you can easily spin up machines on Google Cloud and start crunching data. You don’t have to spend months setting up your own Big Data infrastructure; it can be available in just minutes. Of course, there are advantages and disadvantages to having your infrastructure on-premises or in the cloud. Big Data experts can help you determine the best setup for your enterprise so you can maximize machine power and optimize costs at the same time.

There was also a panel interview with some of the heavy users of Hadoop, namely Blue Cross, Macy’s, and ConocoPhillips, companies that have been highly successful in implementing the technology. During the panel, all the participants acknowledged that the platform is the easy part of building a Big Data and analytics practice, and that people and process are the hardest challenges because change management has to be implemented within the organization. Thus, there needs to be a Big Data champion inside the organization who can effectively lead the implementation of this new kind of technology within the company. The panel reiterated that having this infrastructure gives business users the flexibility to access data and generate insights faster for decision making. They also pointed out that data security and governance should be kept in mind from the beginning. Most organizations focus on the technology and how data can be accessed, and only tackle security and governance after users have started working with the infrastructure. Security and governance are harder to integrate once the infrastructure is already in production, as downtime and architecture changes have to be considered.

Panel interview with some of the users of Hadoop: Blue Cross for health, Macy’s for retail, and ConocoPhillips for energy. Everyone agreed that the biggest challenges are people and process.

The breakout sessions were divided into business and technical tracks, satisfying all kinds of audiences. Some of my personal favorites included a talk on Uber’s developer data kit, a session on Microsoft and Google’s Hadoop cloud platforms, and Blue Cross’ showcase of their Hadoop reference architecture. Uber’s data kit allows multiple groups to access customer and trip data. It works like an API, so their developers can simply call its methods every time they build new functionality for the app. Designing code this way eliminates the need to rewrite hundreds of lines of code and keeps the development team lean and efficient. This is also true for Google, which made its APIs for machine learning and image and voice recognition available to the public, making it easier for others to develop new applications that can solve the world’s problems. Microsoft talked about the unreasonable effectiveness of ACID (Analytics, Cloud, IoT, Data) in areas like education, farming, science, transportation, and the environment. Genetics studies that used to run for years before producing results can now be completed in days. This is exciting, as it means we may finally have a cure for cancer soon. And with IoT and Big Data, healthcare is now being transformed to ensure proper care is given to patients. Studies on this can then lead to healthier citizens in the future.

Women in Big Data

The event that really struck a chord was a lunch with the group Women in Big Data, during which an interesting question was raised about how to encourage young girls to be more inclined toward technology. One of the panelists said that it should start with how parents raise their children. The toys, colors, and perspective of a child should not be restricted to what society dictates. I completely agree with this. Other ways I believe we can encourage young women to strive for a career in technology are setting an example by breaking gender barriers, exposing them to all sorts of technologies at a young age, and empowering them to pursue the track. Developing a community that supports and encourages women to pursue technology and fostering mentor-mentee relationships are some of the things that the group aims to achieve.

The latest products showcased

Apart from the sessions, there was also a community showcase area in which all of the partners and sponsors had their own unique ways of catching the participants’ attention. Needless to say, there were lots of freebies, and we managed to check out other product demos and ask detailed questions. One nifty product by Trifacta called Wrangler allows easy cleaning and preparation of data through drag-and-drop functionality so data can be visualized smoothly. With this, business users can manipulate and pull the appropriate data for the reports they need to generate. Gone are the days when you needed engineers just to prepare the data for consumption. Several hardware vendors were also there showcasing products such as switches, storage, racks, servers, and network devices. You no longer need expensive hardware to deploy Big Data infrastructure; commodity hardware from Edgecore, Quanta, and SuperMicro can be used to support complex analytics jobs.

Party time! Great DJ, laser lights, dance music, games and unlimited food and drinks!

The next 10 years

Throughout the event, I kept thinking how overwhelming it was to see how big the community has gotten, and how many of the biggest enterprises in the world now leverage the technology. It confirms that digital transformation is not just a fad but a MUST, and that data will be the most precious asset of any company. It’s exciting to think about what happens next and how technology will shape the world in the next 10 years. I hope to see more companies from the Philippines participating in future Hadoop Summit events. Let’s show the world what Filipinos are capable of!