Programming & IT Tricks

Showing posts with label Artificial Intelligence. Show all posts

Thursday, October 11, 2018

Samsung Galaxy A9 (2018), the world's first smartphone with four rear cameras, launched


Samsung has launched the Galaxy A9 (2018), the world's first smartphone with four rear cameras. The phone was unveiled at an event in Kuala Lumpur, Malaysia, on Thursday. Its headline feature is the quad rear camera setup, which makes it the first phone of its kind. For comparison, the Galaxy A7 was launched with three rear cameras.
Samsung Galaxy A9 (2018) specification

This phone runs Android 8.1 Oreo and has a 6.3-inch Full HD+ Super AMOLED display with dual-SIM support. Apart from this, the phone has a Qualcomm Snapdragon 660 processor, up to 8 GB of RAM and 128 GB of storage.

The Samsung Galaxy A9 (2018) has four rear cameras: a 24-megapixel main lens, a 10-megapixel telephoto lens with 2x optical zoom, an 8-megapixel ultra-wide-angle lens and a 5-megapixel fourth lens. The four cameras are arranged from top to bottom in a single line. The front has a 24-megapixel camera.

Samsung Galaxy A9 (2018) has a 3800 mAh battery that supports fast charging. There will be a fingerprint sensor in the phone's power button.
Price of Samsung Galaxy A9 (2018)

The price of the Samsung Galaxy A9 (2018) is 599 euros, which is approximately Rs 51,300. However, there is still no word on how much the Samsung Galaxy A9 (2018) will cost in India. This phone will be available in Bubblegum Pink, Caviar Black and Lemonade Blue color variants.

Sunday, February 18, 2018

Why Google Assistant is the Future of AI?



In October 2017, Google's CEO made a strong AI-focused statement, saying that “Google is now an AI first company”. The announcement made clear that the world’s leading internet search company was truly embracing the AI revolution and intended to establish itself as a first-class leader in this technology. The acquisition of API.AI, an advanced artificial intelligence platform now known as Dialogflow, was the first stepping stone from a mobile-first company to an AI-first company.
Social Media Versus Digital Devices Usage
Lately, I asked the following series of questions to all my friends and family members:
Carl: “How much time do you spend on Facebook each day ❓”
Them: “IDK 5 to 30 minutes per day, why ❓” 🤔
Carl: “You’ll see… Now how much time do you spend doing the following things:
- Interacting with a mobile device 📱
- Watching tv 📺
- Streaming music or videos on an app 🎵
- Driving your car 🚗
- Browsing on internet 💻
- Searching for an answer to a question you have by googling it”
Them: “Pretty much half of my day…why❓” 🤨
The time you spend browsing the internet, watching Netflix, driving your car and interacting with your phone is much greater than the time you spend on all of your social media platforms. This means that Google Assistant has more opportunities to capture your attention, and is more likely to retarget you, than any other AI platform.
In the future you will not only ask the Google Assistant questions by voice through your mobile phone or Google Home devices, you will also do the same through your car’s Internet of Things (IoT) device and through other IoT gear such as mirrors, smart TVs, your fridge and so on.
Some Social Media are Getting Old 👴🏻
The first social media I used was mIRC. I made a lot of great friends there, and we would get together at punk band shows in my home province of Quebec. Then came MSN Messenger, then the Google search engine, which my father introduced to me, and then Facebook. When I created my Facebook profile more than 10 years ago, I remember being the only person I knew who had one.
Facebook has reached what we call the maturity stage in technology. This is nothing against it, but everything in life, including corporations, gets old one day. Some companies age faster, while others seem to have taken a sip from the Holy Grail of the third Indiana Jones movie and gained eternal life.

In my personal opinion, Google seems much younger now than its main rival and, to be honest, the Alphabet-owned corporation has been much more transparent, to say the least, about its course of action on artificial intelligence than its main competitor.
Some of the youngest players also have a bright future in the AI ecosystem. According to a report by Global Web Index, Telegram, BBM and WeChat are the front-runners among social media platforms, with 89%, 81% and 81% of active users, respectively, interested in money transfer features on mobile. Money transfer is crucial for any AI platform in order to deliver an attractive return on investment (ROI) for its investors.

YouTube Star Amongst the Most Wanted Career for Primary School Students
When I was 12 years old my dream was to become the next Kurt Cobain. I learned to play bass guitar and my punk band was named No Way Out. Nowadays kids want to be the next PewDiePie and become a YouTube star.

A couple of weeks ago, when I was at my parents’ house, a news report mentioned that becoming a YouTube star was now one of the top 10 most wanted jobs for children in primary school. What does that have to do with artificial intelligence?
Technology changes have always been driven by kids, notably because they have the biggest social channels and because they also want to assert themselves by being different from the older generations.
This means that if kids want to become YouTube stars, they are more likely to interact with Google's AI than with that of any other digital giant wanting a piece of the robot age revolution.
Google Assistant Will Make AI Friendly
I don’t know about you, but I am getting bored of texting. It is slow and inefficient, and it lacks the emotional warmth of a real conversation. Google Assistant not only allows you to create voice-recognition AI, it also lets you personalize the voice by selecting a male or female voice or by slowing down the pace of the AI's speech, and this is only the beginning.
In a couple of years from now, kids will interact with AI robots like they now do with their cats and dogs:
They will be active members of their family.
This prediction might seem crazy to most of you right now, but my prediction that cryptocurrencies would become the next means of economic trade also seemed crazy to all my peers 4 years ago. At that time, Bitcoin was trading at between $200 and $800.
Will I be right again? Only time will tell.


Tuesday, January 23, 2018

AI Innovation: Security and Privacy Challenges


To anyone working in technology (or, really, anyone on the Internet), the term “AI” is everywhere. Artificial intelligence — technically, machine learning — is finding application in virtually every industry on the planet, from medicine and finance to entertainment and law enforcement. As the Internet of Things (IoT) continues to expand, and the potential for blockchain becomes more widely realized, ML growth will occur through these areas as well.
While current technical constraints limit these models from reaching “general intelligence” capability, organizations continue to push the bounds of ML’s domain-specific applications, such as image recognition and natural language processing. Modern computing power (GPUs in particular) has contributed greatly to these recent developments, and it is worth noting that quantum computing could accelerate this progress further over the next several years.
Alongside enormous growth in this space, however, has come increased criticism: from conflating AI with machine learning to relying on those very buzzwords to attract large investments, many “innovators” in this space have led technologists to question the legitimacy of their contributions. Thankfully, there’s plenty of room (and, by extension, overlooked profit) for innovation with ML’s security and privacy challenges.

Reverse-Engineering

Machine learning models, much like any piece of software, are prone to theft and subsequent reverse-engineering. In late 2016, researchers at Cornell Tech, the Swiss Institute EPFL, and the University of North Carolina reverse-engineered a sophisticated Amazon AI by analyzing its responses to only a few thousand queries; their clone replicated the original model’s output with nearly perfect accuracy. The process is not difficult to execute, and once completed, hackers will have effectively “copied” the entire machine learning algorithm — which its creators presumably spent generously to develop.
The risk this poses will only continue to grow. In addition to the potentially massive financial costs of intellectual property theft, this vulnerability also poses threats to national security — especially as governments pour billions of dollars into autonomous weapon research.
While some researchers have suggested that increased model complexity is the best solution, there hasn’t been nearly enough open work done in this space; it’s a critical (albeit underpublicized) opportunity for innovation — all in defense of the multi-billion-dollar AI sector.

Adversarial “Injection”

Machine learning also faces the risk of adversarial “injection” — sending malicious data that disrupts a neural network’s functionality. Last year, for instance, researchers from four top universities confused image recognition systems by adding small stickers onto a photo, through what they termed Robust Physical Perturbation (RP2) attacks; the networks in question then misclassified the image. Another team at NYU showed a similar attack against a facial recognition system, which would allow a suspect individual to easily escape detection.
Not only is this attack a threat to the network itself (e.g. consider this against a self-driving car), but it’s also a threat to companies who outsource their AI development and risk contractors putting their own “backdoors” into the system. Jaime Blasco, Chief Scientist at security company AlienVault, points out that this risk will only increase as the world depends more and more on machine learning. What would happen, for instance, if these flaws persisted in military systems? Law enforcement cameras? Surgical robots?
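
As a rough illustration of how a small, targeted change to the input can flip a model's decision, the sketch below perturbs the input of a toy linear classifier by stepping each feature against the sign of its weight (the idea behind sign-of-gradient attacks). This is not the RP2 attack itself, and the weights, input and `epsilon` are made-up values.

```python
def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


# Hypothetical linear classifier: class 1 if w.x + b > 0, else class 0.
W = [1.0, -2.0, 0.5]
B = 0.1


def score(x):
    return sum(w * xi for w, xi in zip(W, x)) + B


def classify(x):
    return 1 if score(x) > 0 else 0


def adversarial_example(x, epsilon):
    """Shift every feature by at most epsilon in the direction that lowers the score."""
    # For a linear model the gradient of the score with respect to x is simply W.
    return [xi - epsilon * sign(w) for xi, w in zip(x, W)]


x = [0.5, 0.1, 0.2]
x_adv = adversarial_example(x, epsilon=0.2)
print(classify(x), classify(x_adv))
```

No feature moves by more than `epsilon`, yet the predicted class changes, which is the core of the risk described above.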

Training Data Privacy

Protecting the training data put into machine learning models is yet another area that needs innovation. Currently, hackers can reverse-engineer user data out of machine learning models with relative ease. Since the bulk of a model’s training data is often personally identifiable information (e.g. in medicine and finance), this means anyone from an organized crime group to a business competitor can reap economic reward from such attacks.
As machine learning models move to the cloud (e.g. for self-driving cars), this becomes even more complicated; at the same time that users need to privately and securely send their data to the central network, the network needs to make sure it can trust the user’s data (so tokenizing the data via hashing, for instance, isn’t necessarily an option). The same challenge applies to everything from mobile phones to weapons systems.
Further, as organizations seek personal data for ML research, their clients might want to contribute to the work (e.g. improving cancer detection) without compromising their privacy (e.g. providing an excess of PII that just sits in a database). These two interests currently seem at odds — but they also aren’t receiving much focus, so we shouldn’t see this opposition as inherent. Smart redesign could easily mitigate these problems.
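
For readers unfamiliar with the "tokenizing via hashing" idea mentioned above, here is a minimal sketch using a keyed hash from Python's standard library. The key and the sample values are hypothetical; the point is only that the token is stable but tells the receiving network nothing it could validate or learn from, which is why hashing alone does not resolve the tension.

```python
import hashlib
import hmac

# Hypothetical secret key kept on the user's side.
KEY = b"example-secret-key"


def tokenize(value: str, secret_key: bytes) -> str:
    """Replace a sensitive value with a keyed SHA-256 token (hex string)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


token_a = tokenize("alice@example.com", KEY)
token_b = tokenize("bob@example.com", KEY)

# The same input always maps to the same token, so records can still be joined,
# but the original value cannot be read back or sanity-checked from the token.
print(token_a)
print(token_b)
```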

Conclusion

In short: it’s time some innovators in the AI space focused on its security and privacy issues. With the world increasingly dependent on these algorithms, there’s simply too much at stake — including a lot of money for those who address these challenges.

Simple App Ideas: How to Find the Next Big Thing


Originally published on http://www.appsterhq.com/
When it comes to building mobile apps, app makers tend to overcomplicate their ideas and strategies.
The app winds up becoming a clunky Swiss Army knife — one that offers too many features, is difficult to learn and use, and costly to maintain.
But when we think about successful apps, it’s often the simplest ones that come to mind — apps like Dropbox and Evernote that address a pressing pain point, yet are effortlessly easy to use.
As Steve Jobs famously said:
“Simple can be harder than complex: you have to work hard to get your thinking clean to make it simple. But it’s worth it in the end because once you get there, you can move mountains.”
Below, I’ll share tried-and-tested strategies that I’ve used to help startups and entrepreneurs at Appster come up with simple app ideas effectively.

1. Train yourself to become an idea machine:

How do you get better at coming up with app ideas?
The answer, according to entrepreneur, best-selling author and podcaster James Altucher is to become an idea machine.
In Altucher’s words, it’s akin to being a superhero: you’re never at a loss for ideas, regardless of the situation you’re in or the questions thrown at you.
If this sounds unrealistic, it isn’t — but it does require plenty of practice. Here’s a quick roundup of Altucher’s tips for becoming an idea machine:

How many ideas should you come up with each time?

10.

Why 10 ideas?

Most of us wouldn’t have much difficulty conceiving a handful of ideas, even on topics that we don’t usually think about.
But it gets challenging right after the fifth idea, when we find ourselves at a loss for more.
The point of the exercise is to break through this stage in order to build up your idea muscle.

How can you assess your ideas?

You can’t, and evaluating your app ideas isn’t the priority at this point in time. Says Altucher:
“You have to try multiple ideas and see which ones gets the excitement of customers, employees, and you can see that people are legitimately using it and excited by it.”

What are topics you can start brainstorming on?

The key here is to have fun with the ideation process, so don’t limit yourself to business-related ideas.
It helps to think out of the box, and conceptualize ideas around topics like “10 ridiculous mobile apps I would want”, “10 ways an app can solve a problem that I’m facing”, “10 mobile apps that I can reinvent” or “10 mobile apps I would improve”.

How long does it take to become an “idea machine”?

Altucher suggests doing this daily for at least six months.

2. Hone your problem-solving skills

How can app makers go about finding the right idea for their startup?
Daniel Kempe, founder of hand-curated content suggestion platform Quuu elaborates in a Forbes article:
“It’s not about the search for ideas, it’s identifying problems or gaps with existing products or services. Ideas are tough to come by, at least good ones are. But problems, they’re everywhere! Almost every product or service you use on a daily basis was created to solve a problem.”
Here are four strategies you can use to hone your problem-solving abilities:

2.1. Review problems you face on a day-to-day basis

The first place to start is by identifying problems that you encounter on a day-to-day basis.
It can be difficult to come up with ideas in a brainstorming session, so start by observing any moments of frustration you have throughout the day — whether it’s an interruption or delay that occurred at work or a problem that crops up in your personal life.
At first glance, these issues may appear to be minute or mundane — but resolving a personal problem has been the launching point for apps like Dropbox and Summly.
The idea for Dropbox arose out of co-founder Drew Houston’s frustration with the absence of a seamless storage solution for his files, while Summly was conceived when founder Nick D’Aloisio found it inefficient to click through Google search results while preparing for his exams.

2.2. Pay attention to everyday conversations

Everyday conversations and seemingly banal complaints can become a source of inspiration.
Try carving out a block of time — say five days to a week — where you pay close attention to and note down problems that crop up in conversations all around you. “I wish this could be better”, “I hate this…”, “Why does this keep happening…” are some phrases you’d want to prick your ears up for.

2.3. Tap into social media

Social media platforms are a great way to find out about larger scale problems encountered by individuals and communities all around the world.
There are different ways of conducting a search via social media: you may create hashtags around the problems you’ve picked up on through the above methods, or come up with a list of hashtags around topics or problems that you’re concerned about.

2.4. Go to where your potential users are at

Blog articles, blog comments, forums, discussion boards, Quora — apart from being a useful source of information, these sites are also a great way to interact with potential users or target audience.
Take note of common problems and issues that are being discussed, and don’t hesitate to post comments or start a new thread around any questions you might have — this could spark off conversations that will shed light on problems that you weren’t previously aware of.

3. Keep a close watch on your competitors

A competitive analysis should be carried out at several stages over the lifetime of a mobile app: during the ideation process, before significant changes are made to your app or business strategy, and at regular intervals to keep up with changes in the competitive landscape.
Below, I’ll be focusing on competitive analysis conducted at the ideation stage.

Pay attention to user feedback and comments

User feedback and comments are a treasure trove of information. Start poring through reviews and ratings left by users of your competitors’ apps — from app store reviews to social media comments and forum posts — to obtain a clearer idea of features and strategies that resonate with your potential users.
If you’re in the midst of sounding out your ideas with friends and family or testing your MVP, keep an eye out for remarks like “I’ve tried out the ABC app, but didn’t like a particular feature they had” or “This feature reminds me of XYZ app” — you’ll know that these are competitors to keep track of.

Dig deep into your competitors’ strategies

By delving deep into the strategies implemented by your competitors, you can then break these down into simpler elements, and reverse engineer the processes to replicate their success.
The objective here isn’t to imitate what your competitors are doing, but to combine their strategies with your existing ideas to create concepts and features that work for your app.
Here’s a checklist of questions to help you get started on your research process:
  • Which strategies have produced the best results for your competitors?
  • Which strategies were unsuccessful?
  • How can you improve on strategies your competitors implemented?
  • How can you adapt these strategies to make them work for your mobile app?

Don’t forget about indirect competitors

While your indirect competitors may not have launched a mobile app, they are still targeting a similar set of users — so it helps to pay attention to how they’re attracting your potential users with their products or services.
Here are key questions to guide you in your analysis of indirect competitors:
  • In what areas are their products or services similar to yours?
  • What are successful strategies and ideas that have helped them target and retain their users?
  • How can these strategies be improved on?
  • Can you adapt these ideas or concepts to make them work for your mobile app?

4. Stay on top of the latest trends

The ever-changing mobile landscape is a challenging space to navigate.
App makers are up against intense competition: a 2017 Statista study indicated that Android users could choose between 2.8 million apps, while the number of apps on the App Store totaled 2.2 million.
In addition, the emergence of trends like augmented reality, virtual reality and chatbots is revolutionizing the way users engage with mobile apps.
Strategies and features that are effective now may easily be rendered irrelevant in a matter of months. Generating ideas that resonate with today’s users requires a constant pursuit of keeping up with the trends.
Here are a few tools and websites you can use to stay on top of the latest developments:
  • Google Trends
  • Google Alerts
  • App Annie: App market data and insights company producing consumer and competitive information on downloads, revenue, ratings, usage, search terms and more. App Annie’s Insights Blog and webinars are also great resources for app makers.
  • Priori Data: App Store intelligence company providing market data and competitive benchmarking information on the global app economy.
  • Forrester Research: Market research firm providing advice on existing and potential impacts of technology.
  • Trendwatching: Independent trend firm scanning the global market for promising consumer trends and insights.
  • Springwise: Provides information on innovation intelligence. Springwise sources for the latest innovation, startup, and business ideas from around the world.

Trends and mobile apps outside of your industry

Too often, startups and businesses fall into the trap of living within the industry bubble. In adopting a myopic focus on industry trends, benchmarking and best practices, companies eventually wind up providing run-of-the-mill experiences that fail to stand out.
This can be prevented by studying and introducing ideas and concepts from industries, businesses or mobile apps that differ from your own.
Here are key takeaways you can gain from studying mobile apps across different industries:
  • Zappos: Zappos is known for delivering stellar customer experiences, and its mobile app is no different. App makers can learn about providing top-notch experiences through studying features like Ask Zappos, a feature that helps users find any product with just a tap of their camera, and Handover, which enables users to shop seamlessly between their Apple devices.
  • JetBlue: Pesky push notifications are a bane for smartphone users. Learn from JetBlue’s timely and thoughtful communication, which includes providing flight check-in reminders 24 hours before a flight is scheduled to take off, as well as notifications to let passengers review flight entertainment options in advance.
  • Venmo: App makers can learn from the convenience and efficiency that mobile payment apps like Venmo provide, from the way user information is saved for easy access to how a complex process like sending money or making purchases can be completed in a few quick taps.

Thursday, January 18, 2018

The 8 Neural Network Architectures Machine Learning Researchers Need to Learn

The Historical Development of Machine Learning’s Core Structure

Why do we need Machine Learning?

Machine learning is needed for tasks that are too complex for humans to code directly. Some tasks are so complex that it is impractical, if not impossible, for humans to work out all of the nuances and code for them explicitly. So instead, we provide a large amount of data to a machine learning algorithm and let the algorithm work it out by exploring that data and searching for a model that achieves what the programmers have set out for it to achieve.
Let’s look at these 2 examples:
  • It is very hard to write programs that solve problems like recognizing a 3-dimensional object from a novel viewpoint in new lighting conditions in a cluttered scene. We don’t know what program to write because we don’t know how it’s done in our brain. Even if we had a good idea about how to do it, the program might be horrendously complicated.
  • It is hard to write a program to compute the probability that a credit card transaction is fraudulent. There may not be any rules that are both simple and reliable. We need to combine a very large number of weak rules. Fraud is also a moving target, so the program needs to keep changing.
Then comes the Machine Learning Approach: Instead of writing a program by hand for each specific task, we collect lots of examples that specify the correct output for a given input. A machine learning algorithm then takes these examples and produces a program that does the job. The program produced by the learning algorithm may look very different from a typical hand-written program. It may contain millions of numbers. If we do it right, the program works for new cases as well as the ones we trained it on. If the data changes the program can change too by training on the new data. You should note that massive amounts of computation are now cheaper than paying someone to write a task-specific program.
Given that, some examples of tasks best solved by machine learning include:
  • Recognizing patterns: Objects in real scenes, Facial identities or facial expressions, Spoken words
  • Recognizing anomalies: Unusual sequences of credit card transactions, Unusual patterns of sensor readings in a nuclear power plant
  • Prediction: Future stock prices or currency exchange rates, Which movies will a person like

What are Neural Networks?

Neural networks are a class of models within the general machine learning literature. So, for example, if you took a Coursera course on machine learning, neural networks would likely be covered. Neural networks are a specific set of algorithms that have revolutionized the field of machine learning. They are inspired by biological neural networks, and the current so-called deep neural networks have proven to work very well. Neural networks are themselves general function approximators, which is why they can be applied to almost any machine learning problem that involves learning a complex mapping from the input to the output space.
Here are the 3 reasons to convince you to study neural computation:
  • To understand how the brain actually works: It’s very big and very complicated and made of stuff that dies when you poke it around. So we need to use computer simulations.
  • To understand a style of parallel computation inspired by neurons and their adaptive connections: It’s a very different style from a sequential computation.
  • To solve practical problems by using novel learning algorithms inspired by the brain: Learning algorithms can be very useful even if they are not how the brain actually works.
After finishing Andrew Ng’s famous Machine Learning course on Coursera, I started developing an interest in neural networks and deep learning. I looked for the best online resources on the topic and found Geoffrey Hinton’s Neural Networks for Machine Learning course. If you are a deep learning practitioner or someone who wants to get into the deep learning/machine learning world, you should really take this course. Geoffrey Hinton is without a doubt a godfather of the deep learning world, and he provided something extraordinary in this course. In this blog post, I want to share the 8 neural network architectures from the course that I believe any machine learning researcher should be familiar with to advance their work.
Generally, these architectures can be put into 3 specific categories:
1 — Feed-Forward Neural Networks
These are the commonest type of neural network in practical applications. The first layer is the input and the last layer is the output. If there is more than one hidden layer, we call them “deep” neural networks. They compute a series of transformations that change the similarities between cases. The activities of the neurons in each layer are a non-linear function of the activities in the layer below.
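
A minimal sketch of the forward pass described above, in plain Python: each layer applies a non-linear function to a weighted sum of the activities in the layer below. The layer sizes, weights and the choice of `tanh` and sigmoid activations are illustrative assumptions, not taken from the course.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One layer: a non-linear function of a weighted sum of the layer below."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, layers):
    """Push the input through each (weights, biases, activation) layer in turn."""
    activity = x
    for weights, biases, activation in layers:
        activity = dense(activity, weights, biases, activation)
    return activity


# Hypothetical network: 2 inputs -> 3 hidden units (tanh) -> 1 output (sigmoid).
layers = [
    ([[0.5, -0.2], [0.1, 0.8], [-0.7, 0.3]], [0.0, 0.1, -0.1], math.tanh),
    ([[1.0, -1.0, 0.5]], [0.2], sigmoid),
]
output = forward([0.5, -1.0], layers)
print(output)
```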
2 — Recurrent Networks
These have directed cycles in their connection graph. That means you can sometimes get back to where you started by following the arrows. They can have complicated dynamics and this can make them very difficult to train. They are more biologically realistic.
There is a lot of interest at present in finding efficient ways of training recurrent nets. Recurrent neural networks are a very natural way to model sequential data. They are equivalent to very deep nets with one hidden layer per time slice, except that they use the same weights at every time slice and they get input at every time slice. They have the ability to remember information in their hidden state for a long time, but it is very hard to train them to use this potential.
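
The "same weights at every time slice" idea can be sketched in a few lines of plain Python. The weight values and the `tanh` non-linearity below are assumptions chosen for illustration; training is not shown.

```python
import math


def rnn_step(h_prev, x_t, w_hh, w_xh, b):
    """One time slice: new hidden state from the previous state and the current input."""
    return [
        math.tanh(
            sum(w_hh[i][j] * h_prev[j] for j in range(len(h_prev)))
            + sum(w_xh[i][k] * x_t[k] for k in range(len(x_t)))
            + b[i]
        )
        for i in range(len(b))
    ]


def run_rnn(inputs, h0, w_hh, w_xh, b):
    """Unroll over the sequence, reusing the same weights at every time slice."""
    h = h0
    states = []
    for x_t in inputs:
        h = rnn_step(h, x_t, w_hh, w_xh, b)
        states.append(h)
    return states


# Hypothetical one-unit network: the hidden state carries a memory of earlier inputs.
w_hh, w_xh, b = [[0.5]], [[1.0]], [0.0]
states_a = run_rnn([[1.0], [0.0]], [0.0], w_hh, w_xh, b)
states_b = run_rnn([[0.0], [1.0]], [0.0], w_hh, w_xh, b)
print(states_a[-1], states_b[-1])
```

The two sequences contain the same values in a different order and end in different hidden states, which is what makes the hidden state useful for sequential data.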
3 — Symmetrically Connected Networks
These are like recurrent networks, but the connections between units are symmetrical (they have the same weight in both directions). Symmetric networks are much easier to analyze than recurrent networks. They are also more restricted in what they can do because they obey an energy function. Symmetrically connected nets without hidden units are called “Hopfield Nets.” Symmetrically connected networks with hidden units are called “Boltzmann machines.”
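
To show what "symmetric weights" and "obeying an energy function" look like in practice, here is a small Hopfield-style sketch. It assumes a simple Hebbian storage rule and states in {-1, +1}; the stored pattern is made up, and this is one common textbook formulation rather than anything specific to the course.

```python
def train_hopfield(patterns):
    """Build symmetric weights from stored patterns (simple Hebbian rule, no self-connections)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def energy(w, s):
    """Energy of state s; the unit updates in recall() never increase it."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))


def recall(w, s, sweeps=5):
    """Update units one at a time until the state settles."""
    s = list(s)
    for _ in range(sweeps):
        for i in range(len(s)):
            total = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if total >= 0 else -1
    return s


pattern = [1, -1, 1, -1, 1, -1]
weights = train_hopfield([pattern])
noisy = [1, 1, 1, -1, 1, -1]  # the stored pattern with one unit flipped
restored = recall(weights, noisy)
print(restored, energy(weights, noisy), energy(weights, restored))
```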

1 — Perceptrons

Considered the first generation of neural networks, perceptrons are simply computational models of a single neuron. They were popularized by Frank Rosenblatt in the early 1960s. They appeared to have a very powerful learning algorithm and lots of grand claims were made for what they could learn to do. In 1969, Minsky and Papert published a book called “Perceptrons” that analyzed what they could do and showed their limitations. Many people thought these limitations applied to all neural network models. However, the perceptron learning procedure is still widely used today for tasks with enormous feature vectors that contain many millions of features.
In the standard paradigm for statistical pattern recognition, we first convert the raw input vector into a vector of feature activations, using hand-written programs based on common sense to define the features. Next, we learn how to weight each of the feature activations to get a single scalar quantity. If this quantity is above some threshold, we decide that the input vector is a positive example of the target class.
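
The weighting-and-threshold step described above can be sketched with the classic perceptron learning rule. The example below assumes the features are already given (here, the two inputs of a logical AND) and uses a learning rate of 1; both are illustrative choices.

```python
def predict(weights, bias, features):
    """Weighted sum of feature activations, compared against a threshold of zero."""
    activation = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(examples, n_features, epochs=20):
    """Perceptron rule: adjust the weights only when an example is misclassified."""
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, target in examples:
            error = target - predict(weights, bias, features)
            if error != 0:
                weights = [w + error * f for w, f in zip(weights, features)]
                bias += error
    return weights, bias


# Toy, linearly separable task: logical AND of two binary features.
and_examples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train_perceptron(and_examples, n_features=2)
predictions = [predict(weights, bias, f) for f, _ in and_examples]
print(weights, bias, predictions)
```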
The standard Perceptron architecture follows the feed-forward model, meaning inputs are sent into the neuron, are processed, and result in an output. In the usual diagram, the network reads bottom-up: input comes in from the bottom and output goes out from the top.
However, Perceptrons do have limitations. If you are allowed to choose the features by hand and if you use enough features, you can do almost anything. For binary input vectors, we can have a separate feature unit for each of the exponentially many binary vectors and so we can make any possible discrimination on binary input vectors. But once the hand-coded features have been determined, there are very strong limitations on what a perceptron can learn.
This result is devastating for Perceptrons because the whole point of pattern recognition is to recognize patterns despite transformations like translation. Minsky and Papert’s “Group Invariance Theorem” says that the part of a Perceptron that learns cannot learn to do this if the transformations form a group. To deal with such transformations, a Perceptron needs to use multiple feature units to recognize transformations of informative sub-patterns. So the tricky part of pattern recognition must be solved by the hand-coded feature detectors, not the learning procedure.
Networks without hidden units are very limited in the input-output mappings they can learn to model. More layers of linear units do not help: the result is still linear. Fixed output non-linearities are not enough. Thus, we need multiple layers of adaptive, non-linear hidden units. But how do we train such nets? We need an efficient way of adapting all the weights, not just the last layer. This is hard. Learning the weights going into hidden units is equivalent to learning features. This is difficult because nobody is telling us directly what the hidden units should do.
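The claim that extra layers of linear units do not help can be checked directly: two stacked linear layers always collapse into one. The sketch below uses plain Python lists; the helper names and the weight values are arbitrary assumptions chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product of a and b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def apply_linear(w: Matrix, x: List[float]) -> List[float]:
    """Apply a linear layer (no bias, no non-linearity) to an input vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


# Two stacked linear layers with arbitrary illustrative weights.
W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]   # 2 inputs -> 3 hidden units
W2 = [[1.0, 0.0, 2.0], [-1.0, 1.0, 0.0]]     # 3 hidden units -> 2 outputs

# A single linear layer that computes exactly the same mapping.
W_COMBINED = matmul(W2, W1)

x = [0.5, -2.0]
two_layer_output = apply_linear(W2, apply_linear(W1, x))
one_layer_output = apply_linear(W_COMBINED, x)
```

Both `two_layer_output` and `one_layer_output` come out the same for any input, which is why the hidden units need a non-linearity before depth adds any modeling power.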

2 — Convolutional Neural Networks

Machine Learning research has focused extensively on object detection problems over time. There are various things that make it hard to recognize objects:
  • Segmentation: Real scenes are cluttered with other objects. It’s hard to tell which pieces go together as parts of the same object. Parts of an object can be hidden behind other objects.
  • Lighting: The intensities of the pixels are determined as much by the lighting as by the objects.
  • Deformation: Objects can deform in a variety of non-affine ways. E.g., a handwritten 2 can have a large loop or just a cusp.
  • Affordances: Object classes are often defined by how they are used. E.g., chairs are things designed for sitting on so they have a wide variety of physical shapes.
  • Viewpoint: Changes in viewpoint cause changes in images that standard learning methods cannot cope with. Information hops between input dimensions (i.e. pixels). Imagine a medical database in which the age of a patient sometimes hops to the input dimension that normally codes for weight! To apply machine learning we would first want to eliminate this dimension-hopping.
The replicated feature approach is currently the dominant approach for neural networks to solve the object detection problem. It uses many different copies of the same feature detector at different positions. It could also replicate across scale and orientation, which is tricky and expensive. Replication greatly reduces the number of free parameters to be learned. It uses several different feature types, each with its own map of replicated detectors. It also allows each patch of image to be represented in several ways.
So what does replicating the feature detectors achieve?
  • Equivariant activities: Replicated features do not make the neural activities invariant to translation. The activities are equivariant.
  • Invariant knowledge: If a feature is useful in some locations during training, detectors for that feature will be available in all locations during testing.
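A one-dimensional sketch makes the equivariance point concrete: when the same detector is replicated at every position, shifting the input shifts the activities by the same amount instead of leaving them unchanged. The function names, the two-weight "rising edge" kernel and the toy signal are assumptions made for this example.

```python
from typing import List


def replicated_detector(signal: List[float], kernel: List[float]) -> List[float]:
    """Apply one shared-weight detector at every valid position (1-D cross-correlation)."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def shift_right(values: List[float], steps: int) -> List[float]:
    """Translate a signal to the right, padding with zeros on the left."""
    return [0.0] * steps + values[:len(values) - steps]


EDGE_KERNEL = [-1.0, 1.0]                      # hypothetical "rising edge" detector
signal = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0]

activity = replicated_detector(signal, EDGE_KERNEL)
shifted_activity = replicated_detector(shift_right(signal, 2), EDGE_KERNEL)
```

Here `shifted_activity` equals `shift_right(activity, 2)`: the activity pattern moves with the input (equivariance), while the detector weights that produced it are the same everywhere (invariant knowledge).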
In 1998, Yann LeCun and his collaborators developed a really good recognizer for handwritten digits called LeNet. It used back propagation in a feedforward net with many hidden layers, many maps of replicated units in each layer, pooling of the outputs of nearby replicated units, a wide net that can cope with several characters at once even if they overlap, and a clever way of training a complete system, not just a recognizer. It was later formalized under the name convolutional neural networks. Fun fact: This net was used for reading ~10% of the checks in North America.
Convolutional Neural Networks can be used for all work related to object recognition, from hand-written digits to 3D objects. However, recognizing real objects in color photographs downloaded from the web is much more complicated than recognizing hand-written digits. There are a hundred times as many classes (1000 vs 10), a hundred times as many pixels (256 x 256 color vs 28 x 28 gray), two-dimensional images of three-dimensional scenes, cluttered scenes requiring segmentation, and multiple objects in each image. Will the same type of convolutional neural network work?
Then came the ILSVRC-2012 competition on ImageNet, a dataset with approximately 1.2 million high-resolution training images. Test images were presented with no initial annotation (no segmentation or labels), and algorithms had to produce labelings specifying what objects were present in the images. Some of the best existing computer vision methods were tried on this dataset by leading computer vision groups from Oxford, INRIA, XRCE… Typically, computer vision systems use complicated multi-stage pipelines, and the early stages are typically hand-tuned by optimizing a few parameters.
The winner of the competition, Alex Krizhevsky (NIPS 2012), developed a very deep convolutional neural net of the type pioneered by Yann LeCun. Its architecture includes 7 hidden layers not counting some max-pooling layers. The early layers were convolutional, while the last 2 layers were globally connected. The activation functions were rectified linear units in every hidden layer. These train much faster and are more expressive than logistic units. In addition to that, it also uses competitive normalization to suppress hidden activities when nearby units have stronger activities. This helps with variations in intensity.
There are a couple of technical tricks that significantly improve generalization for the neural net:
  1. Training on random 224 x 224 patches from the 256 x 256 images to get more data and using left-right reflections of the images. At test time, combining the opinions from 10 different patches: The four 224 x 224 corner patches plus the central 224 x 224 patch plus the reflections of those 5 patches.
  2. Using “dropout” to regularize the weights in the globally connected layers (which contain most of the parameters). Dropout means that half of the hidden units in a layer are randomly removed for each training example. This stops hidden units from relying too much on other hidden units.
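Both tricks are easy to sketch. The code below is an illustration under stated assumptions, not the competition implementation: `ten_crop_boxes` and `dropout` are hypothetical helper names, patches are described only by their top-left offsets and a mirror flag, and dropout is modeled as dropping each unit independently with probability 0.5.

```python
import random
from typing import List, Tuple


def ten_crop_boxes(image_size: int = 256, patch: int = 224) -> List[Tuple[int, int, bool]]:
    """Return (top, left, mirrored) for the 4 corner patches, the central patch, and their reflections."""
    far = image_size - patch            # offset of the bottom and right corner patches
    mid = far // 2                      # offset of the central patch
    offsets = [(0, 0), (0, far), (far, 0), (far, far), (mid, mid)]
    return [(top, left, mirrored) for mirrored in (False, True) for top, left in offsets]


def dropout(activities: List[float], rng: random.Random, drop_prob: float = 0.5) -> List[float]:
    """Zero each hidden activity independently with probability drop_prob (training time only)."""
    return [0.0 if rng.random() < drop_prob else a for a in activities]
```

At test time the network's predictions for the ten patches returned by `ten_crop_boxes()` would be combined, for example by averaging; `dropout` would be applied only during training, with a fresh random mask for every training example.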
In terms of hardware requirement, Alex uses a very efficient implementation of convolutional nets on 2 Nvidia GTX 580 GPUs (over 1000 fast little cores). The GPUs are very good for matrix-matrix multiplies and also have very high bandwidth to memory. This allows him to train the network in a week and makes it quick to combine results from 10 patches at test time. We can spread a network over many cores if we can communicate the states fast enough. As cores get cheaper and datasets get bigger, big neural nets will improve faster than old-fashioned computer vision systems.

3 — Recurrent Neural Network

To understand RNNs, we need to have a brief overview of sequence modeling. When applying machine learning to sequences, we often want to turn an input sequence into an output sequence that lives in a different domain; for example, turn a sequence of sound pressures into a sequence of word identities. When there is no separate target sequence, we can get a teaching signal by trying to predict the next term in the input sequence. The target output sequence is the input sequence with an advance of 1 step. This seems much more natural than trying to predict one pixel in an image from the other pixels, or one patch of an image from the rest of the image. Predicting the next term in a sequence blurs the distinction between supervised and unsupervised learning. It uses methods designed for supervised learning, but it doesn’t require a separate teaching signal.
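Building the teaching signal for next-term prediction takes one slice: the targets are simply the inputs advanced by one step. The helper name `next_step_pairs` and the character-level example are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def next_step_pairs(sequence: Sequence[T]) -> Tuple[List[T], List[T]]:
    """Build (inputs, targets) where the targets are the inputs advanced by one step."""
    inputs = list(sequence[:-1])
    targets = list(sequence[1:])
    return inputs, targets


# Character-level example: each character is trained to predict the one after it.
inputs, targets = next_step_pairs("hello")
```

No separate labels are needed: the sequence supervises itself, which is why this setup sits between supervised and unsupervised learning.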
Memoryless models are the standard approach to this task. In particular, autoregressive models can predict the next term in a sequence from a fixed number of previous terms using “delay taps”, and feed-forward neural nets are generalized autoregressive models that use one or more layers of non-linear hidden units. However, if we give our generative model some hidden state, and if we give this hidden state its own internal dynamics, we get a much more interesting kind of model: It can store information in its hidden state for a long time. If the dynamics are noisy and the way they generate outputs from their hidden state is noisy, we can never know its exact hidden state. The best we can do is to infer a probability distribution over the space of hidden state vectors. This inference is only tractable for 2 types of hidden state model.
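A linear autoregressive model with delay taps is just a weighted sum over a fixed window of recent terms. In this sketch the function name and the choice of taps are illustrative assumptions; in practice the tap weights would be learned from data.

```python
from typing import Sequence


def autoregressive_predict(history: Sequence[float], taps: Sequence[float]) -> float:
    """Predict the next term as a weighted sum of the last len(taps) terms.

    taps[0] weights the most recent term, taps[1] the one before it, and so on.
    """
    recent = list(history[-len(taps):])[::-1]   # most recent term first
    return sum(w * x for w, x in zip(taps, recent))


# With taps (1, 1) the model reproduces the Fibonacci recurrence.
prediction = autoregressive_predict([1, 1, 2, 3, 5], taps=[1.0, 1.0])
```

The model is memoryless in the sense used above: anything older than the window of `len(taps)` terms cannot influence the prediction.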
Recurrent Neural Networks are very powerful, because they combine 2 properties: 1) distributed hidden state that allows them to store a lot of information about the past efficiently, and 2) non-linear dynamics that allow them to update their hidden state in complicated ways. With enough neurons and time, RNNs can compute anything that can be computed by your computer. So what kinds of behavior can RNNs exhibit? They can oscillate, they can settle to point attractors, they can behave chaotically. And they could potentially learn to implement lots of small programs that each capture a nugget of knowledge and run in parallel, interacting to produce very complicated effects.
However, the computational power of RNNs makes them very hard to train. It is quite difficult to train an RNN because of the exploding or vanishing gradients problem. As we backpropagate through many layers, what happens to the magnitude of the gradients? If the weights are small, the gradients shrink exponentially. If the weights are big, the gradients grow exponentially. Typical feed-forward neural nets can cope with these exponential effects because they only have a few hidden layers. On the other hand, in an RNN trained on long sequences, the gradients can easily explode or vanish. Even with good initial weights, it’s very hard to detect that the current target output depends on an input from many time-steps ago, so RNNs have difficulty dealing with long-range dependencies.
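The exponential effect is easiest to see in the simplest possible case, assumed here purely for illustration: a scalar linear RNN with dynamics h_t = w * h_(t-1). Backpropagating through one time-step multiplies the gradient by w, so after T steps it has been scaled by w to the power T.

```python
def backpropagated_gradient_scale(weight: float, steps: int) -> float:
    """Scale factor applied to a gradient after `steps` time-steps of h_t = weight * h_(t-1)."""
    scale = 1.0
    for _ in range(steps):
        scale *= weight   # each step back in time multiplies by the recurrent weight
    return scale


vanishing = backpropagated_gradient_scale(0.5, 50)   # shrinks towards zero
exploding = backpropagated_gradient_scale(2.0, 50)   # grows without bound
```

With non-linear units and weight matrices the picture is more involved, but the same multiplicative structure is what makes long-range dependencies hard to learn.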
There are essentially 4 effective ways to learn an RNN:
  • Long Short Term Memory: Make the RNN out of little modules that are designed to remember values for a long time.
  • Hessian Free Optimization: Deal with the vanishing gradients problem by using a fancy optimizer that can detect directions with a tiny gradient but even smaller curvature.
  • Echo State Networks: Initialize the input -> hidden and hidden -> hidden and output -> hidden connections very carefully so that the hidden state has a huge reservoir of weakly coupled oscillators which can be selectively driven by the input.
  • Good initialization with momentum: Initialize like in Echo State Networks, but then learn all of the connections using momentum.

4 — Long/Short Term Memory Network

Hochreiter & Schmidhuber (1997) solved the problem of getting an RNN to remember things for a long time (like hundreds of time steps) by building what is known as a long short-term memory network. They designed a memory cell using logistic and linear units with multiplicative interactions. Information gets into the cell whenever its “write” gate is on. The information stays in the cell so long as its “keep” gate is on. Information can be read from the cell by turning on its “read” gate.
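The gating logic can be sketched with a single scalar cell. This is a simplified illustration, not the full LSTM equations: the class name `MemoryCell` is hypothetical, and the gate values are passed in directly as numbers between 0 and 1, whereas in a real network they would be the outputs of logistic units driven by the inputs and the hidden state.

```python
from dataclasses import dataclass


@dataclass
class MemoryCell:
    """A scalar memory cell controlled by multiplicative write, keep and read gates."""

    value: float = 0.0

    def step(self, candidate: float, write: float, keep: float, read: float) -> float:
        """Update the stored value and return what the rest of the network can see.

        All three gates are expected to lie in [0, 1].
        """
        # The keep gate scales the old contents; the write gate lets new input in.
        self.value = keep * self.value + write * candidate
        # The read gate controls how much of the stored value is exposed.
        return read * self.value


cell = MemoryCell()
cell.step(1.7, write=1.0, keep=0.0, read=0.0)          # store 1.7, output nothing
for _ in range(100):
    cell.step(9.9, write=0.0, keep=1.0, read=0.0)      # hold the value, ignore new input
recalled = cell.step(9.9, write=0.0, keep=1.0, read=1.0)
```

Because the keep gate multiplies the stored value by exactly 1 while it is on, the value (and the gradient flowing back through it) survives for as many time-steps as needed.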
Reading cursive handwriting is a natural task for an RNN. The input is a sequence of (x, y, p) coordinates of the tip of the pen, where p indicates whether the pen is up or down. The output is a sequence of characters. Graves & Schmidhuber (2009) showed that RNNs with LSTM are currently the best systems for reading cursive writing. In brief, they used a sequence of small images as input rather than pen coordinates.

5 — Hopfield Networks

Recurrent networks of non-linear units are generally very hard to analyze. They can behave in many different ways: settle to a stable state, oscillate, or follow chaotic trajectories that cannot be predicted far into the future. A Hopfield net is composed of binary threshold units with recurrent connections between them. In 1982, John Hopfield realized that if the connections are symmetric, there is a global energy function. Each binary “configuration” of the whole network has an energy, and the binary threshold decision rule causes the network to settle to a minimum of this energy function. A neat way to make use of this type of computation is to use memories as energy minima for the neural net. Using energy minima to represent memories gives a content-addressable memory. An item can be accessed by just knowing part of its content. It is robust against hardware damage.
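A small sketch shows the energy function and content-addressable recall at work. It rests on several assumptions that the text does not fix: units take the values +1 and -1, there are no biases, memories are stored in one shot with the simple Hebbian rule, and the function names are hypothetical.

```python
from typing import List


def store(patterns: List[List[int]]) -> List[List[int]]:
    """One-shot Hebbian storage: w_ij = sum over patterns of s_i * s_j, with w_ii = 0."""
    n = len(patterns[0])
    return [[0 if i == j else sum(p[i] * p[j] for p in patterns) for j in range(n)]
            for i in range(n)]


def energy(weights: List[List[int]], state: List[int]) -> int:
    """Global energy E = -sum over pairs i < j of w_ij * s_i * s_j (symmetric weights, no biases)."""
    n = len(state)
    return -sum(weights[i][j] * state[i] * state[j]
                for i in range(n) for j in range(i + 1, n))


def settle(weights: List[List[int]], state: List[int], sweeps: int = 5) -> List[int]:
    """Apply the binary threshold rule to one unit at a time; each update never raises the energy."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            total_input = sum(weights[i][j] * state[j] for j in range(len(state)))
            state[i] = 1 if total_input >= 0 else -1
    return state


memory = [1, -1, 1, -1, 1, -1]
weights = store([memory])
probe = [1, -1, 1, -1, -1, -1]          # the memory with one unit corrupted
recalled = settle(weights, probe)
```

Starting from the corrupted `probe`, the network slides downhill in energy and ends in the stored `memory`, which is what accessing an item from part of its content means here.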
Each time we memorize a configuration, we hope to create a new energy minimum. But what if two nearby minima merge to create a minimum at an intermediate location? This limits the capacity of a Hopfield net. So how do we increase the capacity of a Hopfield net? Physicists love the idea that the math they already know might explain how the brain works. Many papers were published in physics journals about Hopfield nets and their storage capacity. Eventually, Elizabeth Gardner figured out that there was a much better storage rule that uses the full capacity of the weights. Instead of trying to store vectors in one shot, she cycled through the training set many times and used the perceptron convergence procedure to train each unit to have the correct state given the states of all the other units in that vector. Statisticians call this technique “pseudo-likelihood.”
There is another computational role for Hopfield nets. Instead of using the net to store memories, we use it to construct interpretations of sensory input. The input is represented by the visible units, the interpretation is represented by the states of the hidden units, and the badness of the interpretation is represented by the energy.

6 — Boltzmann Machine Network

A Boltzmann machine is a type of stochastic recurrent neural network. It can be seen as the stochastic, generative counterpart of Hopfield nets. It was one of the first neural networks capable of learning internal representations and is able to represent and solve difficult combinatoric problems.
The goal of the Boltzmann machine learning algorithm is to maximize the product of the probabilities that the Boltzmann machine assigns to the binary vectors in the training set. This is equivalent to maximizing the sum of the log probabilities that the Boltzmann machine assigns to the training vectors. It is also equivalent to maximizing the probability that we would obtain exactly the N training cases if we did the following: 1) Let the network settle to its stationary distribution N different times with no external input; and 2) Sample the visible vector once each time.
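Written out, the first equivalence follows because the logarithm is monotonically increasing, so it does not change which weights give the maximum. The notation is introduced here for illustration: W stands for all the weights and biases, and v^(n) is the n-th of the N training vectors.

```latex
\hat{W}
  = \arg\max_{W} \prod_{n=1}^{N} p\!\left(\mathbf{v}^{(n)} \mid W\right)
  = \arg\max_{W} \sum_{n=1}^{N} \log p\!\left(\mathbf{v}^{(n)} \mid W\right)
```

Working with the sum of log probabilities is usually preferred in practice, since a product of many small probabilities quickly underflows.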
An efficient mini-batch learning procedure was proposed for Boltzmann Machines by Salakhutdinov and Hinton in 2012.
  • For the positive phase, first initialize the hidden probabilities at 0.5, then clamp a data vector on the visible units, then update all the hidden units in parallel until convergence using mean field updates. After the net has converged, record PiPj for every connected pair of units and average this over all data in the mini-batch.
  • For the negative phase: first keep a set of “fantasy particles.” Each particle has a value that is a global configuration. Then sequentially update all the units in each fantasy particle a few times. For every connected pair of units, average SiSj over all the fantasy particles.
In a general Boltzmann machine, the stochastic updates of units need to be sequential. There is a special architecture that allows alternating parallel updates which are much more efficient (no connections within a layer, no skip-layer connections). This mini-batch procedure makes the updates of the Boltzmann machine more parallel. This is called a Deep Boltzmann Machine (DBM), a general Boltzmann machine with a lot of missing connections.
Restricted Boltzmann Machines (RBMs) take this idea further. They were introduced by Paul Smolensky in 1986 under the name Harmonium and were later popularized by Hinton and his collaborators. They restrict the connectivity to make inference and learning easier (only one layer of hidden units and no connections between hidden units). In an RBM it only takes one step to reach thermal equilibrium when the visible units are clamped.
Another efficient mini-batch learning procedure for RBM goes like this:
  • For the positive phase, first clamp a data vector on the visible units. Then compute the exact value of <ViHj> for all pairs of a visible and a hidden unit. For every connected pair of units, average <ViHj> over all data in the mini-batch.
  • For the negative phase, also keep a set of “fantasy particles.” Then update each fantasy particle a few times using alternating parallel updates. For every connected pair of units, average ViHj over all the fantasy particles.
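The positive phase is cheap because, with no connections between hidden units, each hidden unit's probability of turning on can be computed exactly in one step once the data is clamped. The sketch below covers only that phase; the function names, the `weights[i][j]` layout (visible unit i to hidden unit j) and the use of plain Python lists are assumptions made for illustration.

```python
import math
from typing import List


def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probabilities(visible: List[float], weights: List[List[float]],
                         hidden_bias: List[float]) -> List[float]:
    """p(h_j = 1 | v) for every hidden unit, computed in parallel with the data clamped."""
    return [logistic(hidden_bias[j] + sum(visible[i] * weights[i][j]
                                          for i in range(len(visible))))
            for j in range(len(hidden_bias))]


def positive_statistics(batch: List[List[float]], weights: List[List[float]],
                        hidden_bias: List[float]) -> List[List[float]]:
    """Exact <v_i h_j> with data clamped, averaged over the mini-batch."""
    n_vis, n_hid = len(weights), len(hidden_bias)
    stats = [[0.0] * n_hid for _ in range(n_vis)]
    for visible in batch:
        probs = hidden_probabilities(visible, weights, hidden_bias)
        for i in range(n_vis):
            for j in range(n_hid):
                stats[i][j] += visible[i] * probs[j] / len(batch)
    return stats
```

The negative-phase statistics would be averaged in the same way over the fantasy particles, and the weight update is proportional to the difference between the two averages.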

7 — Deep Belief Network

Back-propagation is considered the standard method in artificial neural networks to calculate the error contribution of each neuron after a batch of data is processed. However, there are some major problems using back-propagation. Firstly, it requires labeled training data; while almost all data is unlabeled. Secondly, the learning time does not scale well, which means it is very slow in networks with multiple hidden layers. Thirdly, it can get stuck in poor local optima, so for deep nets they are far from optimal.
To overcome the limitations of back-propagation, researchers have considered using unsupervised learning approaches. This helps keep the efficiency and simplicity of using a gradient method for adjusting the weights, but also use it for modeling the structure of the sensory input. In particular, they adjust the weights to maximize the probability that a generative model would have generated the sensory input. The question is what kind of generative model should we learn? Can it be an energy-based model like a Boltzmann machine? Or a causal model made of idealized neurons? Or a hybrid of the two?
A belief net is a directed acyclic graph composed of stochastic variables. Using a belief net, we get to observe some of the variables and we would like to solve 2 problems: 1) The inference problem: Infer the states of the unobserved variables, and 2) The learning problem: Adjust the interactions between variables to make the network more likely to generate the training data.
Early graphical models used experts to define the graph structure and the conditional probabilities. Back then, the graphs were sparsely connected, so researchers initially focused on doing correct inference, not on learning. For neural nets, learning was central and hand-writing the knowledge was not cool, because knowledge came from learning the training data. Neural networks did not aim for interpretability or sparse connectivity to make inference easy. Nevertheless, there are neural network versions of belief nets.
There are two types of generative neural network composed of stochastic binary neurons: 1) Energy-based, in which we connect binary stochastic neurons using symmetric connections to get a Boltzmann Machine; and 2) Causal, in which we connect binary stochastic neurons in a directed acyclic graph to get a Sigmoid Belief Net. The descriptions of these two types go beyond the scope of this article.

8 — Deep Auto-encoders

Finally, let’s discuss deep auto-encoders. They always looked like a really nice way to do non-linear dimensionality reduction for a few reasons: They provide flexible mappings both ways. The learning time is linear (or better) in the number of training cases. And the final encoding model is fairly compact and fast. However, it turned out to be very difficult to optimize deep auto encoders using back propagation. With small initial weights, the back propagated gradient dies. We now have much better ways to optimize them: either use unsupervised layer-by-layer pre-training or just initialize the weights carefully, as in Echo State Networks.
For the pre-training task, there are actually 3 different types of shallow auto-encoders:
  1. RBM’s as auto-encoders: When we train an RBM with one-step contrastive divergence, it tries to make the reconstructions look like data. It’s like an auto encoder, but it’s strongly regularized by using binary activities in the hidden layer. When trained with maximum likelihood, RBMs are not like auto encoders. We can replace the stack of RBM’s used for pre-training by a stack of shallow auto encoders; however pre-training is not as effective (for subsequent discrimination) if the shallow auto encoders are regularized by penalizing the squared weights.
  2. Denoising auto encoders: These add noise to the input vector by setting many of its components to 0 (like dropout, but for inputs). They are still required to reconstruct these components, so they must extract features that capture correlations between inputs. Pre-training is very effective if we use a stack of denoising auto encoders. It’s as good as or better than pre-training with RBMs. It’s also simpler to evaluate the pre-training because we can easily compute the value of the objective function. It lacks the nice variational bound we get with RBMs, but this is only of theoretical interest.
  3. Contractive auto encoders: Another way to regularize an auto encoder is to try to make the activities of the hidden units as insensitive as possible to the inputs; but they cannot just ignore the inputs because they must reconstruct them. We achieve this by penalizing the squared gradient of each hidden activity with respect to the inputs. Contractive auto encoders work very well for pre-training. The codes tend to have the property that only a small subset of the hidden units are sensitive to changes in the input.
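The two regularizers in items 2 and 3 can each be sketched in a few lines. Both functions are illustrations under stated assumptions: the names `corrupt` and `contractive_penalty` are hypothetical, the noise is independent masking with a chosen probability, and the hidden units are assumed to be logistic, so the derivative of hidden activity h_j with respect to input x_i is h_j * (1 - h_j) * W[j][i].

```python
import math
import random
from typing import List


def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def corrupt(inputs: List[float], rng: random.Random, drop_prob: float = 0.3) -> List[float]:
    """Denoising: set each input component to 0 independently with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else x for x in inputs]


def contractive_penalty(inputs: List[float], weights: List[List[float]],
                        biases: List[float]) -> float:
    """Sum over hidden units j and inputs i of (d h_j / d x_i) squared, for logistic hidden units.

    weights[j][i] connects input i to hidden unit j.
    """
    penalty = 0.0
    for w_row, bias in zip(weights, biases):
        h = logistic(bias + sum(w * x for w, x in zip(w_row, inputs)))
        slope = h * (1.0 - h)                       # derivative of the logistic at this activity
        penalty += slope ** 2 * sum(w * w for w in w_row)
    return penalty
```

A denoising auto encoder would be trained to reconstruct the clean `inputs` from `corrupt(inputs, rng)`, while a contractive auto encoder would add a multiple of `contractive_penalty(...)` to its reconstruction error.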
In brief, there are now many different ways to do layer-by-layer pre-training of features. For datasets that do not have huge numbers of labeled cases, pre-training helps subsequent discriminative learning. For very large, labeled datasets, initializing the weights used in supervised learning by using unsupervised pre-training is not necessary, even for deep nets. Pre-training was the first good way to initialize the weights for deep nets, but now there are other ways. But if we make the nets much larger, we will need pre-training again!
Last Takeaway
Neural networks are one of the most beautiful programming paradigms ever invented. In the conventional approach to programming, we tell the computer what to do, breaking big problems up into many small, precisely defined tasks that the computer can easily perform. By contrast, in a neural network we don’t tell the computer how to solve our problem. Instead, it learns from observational data, figuring out its own solution to the problem at hand.
Today, deep neural networks and deep learning achieve outstanding performance on many important problems in computer vision, speech recognition, and natural language processing. They’re being deployed on a large scale by companies such as Google, Microsoft, and Facebook.
I hope that this post helps you learn the core concepts of neural networks, including modern techniques for deep learning. You can get all the lecture slides, research papers and programming assignments I have done for Dr. Hinton’s Coursera course from my GitHub repo here. Good luck studying!