Energy Efficiency and Energy Intensity

Energy efficiency, or efficient energy usage, is the goal of reducing energy consumption by adopting new technology or practices. For example, using LED bulbs instead of CFLs for lighting reduces both energy consumption and cost. Energy efficiency is an important means of handling growing energy demand and of achieving sustainable energy.

Energy Intensity is a quantitative metric of a nation’s energy efficiency. For a country, it is calculated as the ratio of the total energy units consumed to the total units of GDP over a calendar year. The lower its value, the better the energy efficiency. A higher value of energy intensity means that more energy is consumed to produce each unit of the nation’s GDP, which is undesirable.
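As a rough illustration of this ratio, the following R sketch computes energy intensity for two imaginary countries. The country names, the figures, and the units (million tonnes of oil equivalent and billion US dollars) are all made-up assumptions, not real statistics.

```r
# Hypothetical yearly figures for two imaginary countries (not real data)
energy_mtoe <- c(CountryA = 500, CountryB = 300)    # total energy consumed, Mtoe
gdp_billion <- c(CountryA = 2500, CountryB = 1000)  # GDP, billion USD

# Energy intensity = total energy consumed / total GDP
energy_intensity <- energy_mtoe / gdp_billion
print(energy_intensity)

# The country with the lowest ratio uses the least energy per unit of GDP
names(which.min(energy_intensity))
```

In this made-up case CountryA has the lower energy intensity, so by this metric it is the more energy-efficient of the two.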


Better Presentations – II

In this post, I will write about the TED talk by Mr. Richard Greene, author of the book “The Words that Shook the World: 100 Years of Unforgettable Speeches and Events”. In the talk, “The 7 secrets of the greatest speakers in history”, Richard first explains seven ingredients of a great talk and then shows video snippets of a few great speakers, including Lou Gehrig, Winston Churchill, John F. Kennedy, and Martin Luther King, among others. Here, I will rephrase the seven ingredients of his talk in five bullet points:

  1. Words, Voice tone, and Body language: Richard’s experience/research shows that words, voice tone, and body language affect 7%, 38%, and 55% of the audience respectively. This means that speaking the right words without varying the voice tone (fast/slow as the situation demands) and without using body language (eye contact, movement) will engage only 7% of the audience, and the rest will feel lost. Therefore, to remain connected with the audience during the entire presentation, the presenter should use all three in appropriate proportions.
  2. Compelling Message: Ensure that your presentation has only one idea and try to plant that idea in the audience.
  3. Only Conversation: Richard explains the difference between a performance, a presentation, and a conversation. He points out that a speaker often thinks of a talk as a performance or a presentation. Thinking of a talk as a “performance” is absurd because a performance is acting, and a talk is not acting. Thinking of a talk as a “presentation” is not fully right either, because then you think only of the audience and not of yourself. Instead, a talk is a “conversation”. In Richard’s words, “Talk/public speaking is nothing but a conversation from your heart about something that you are authentically passionate about.”
  4. Four Languages of Communication: The presenter should ensure that the presentation uses the following four languages. (i) Visual: Show a visual representation of as many concepts as possible. (ii) Auditory: Summarize findings through stories. (iii) Auditory digital: Support findings with analytical and statistical measures/numbers. (iv) Kinesthetic: Develop means of connecting with the audience, i.e., the audience should feel the message/idea.
  5. Authentic Passion: This is the most important of all the ingredients mentioned above. The presenter must ensure that the message is worth presenting.

Better Presentations – I

A few days back I attended a doctoral symposium along with my fellow PhD students at NIIT University, Neemrana, Rajasthan. It was a two-day event in which 17 Ph.D. students from elite educational institutions of India and a few experts from industry and academia (IIT professors) gave presentations. Each presentation lasted around 15 to 20 minutes and was followed by 1 or 2 questions. After the symposium, my colleagues and I discussed the presentations, i.e., which presentations were catchy and what made the others boring. I won’t repeat the discussion here, but the debate prompted me to dig deeper and understand the characteristics of better presentations.

In this post and a subsequent one, I will pin down the ingredients of a better presentation. In these posts I will present the views of world-class presenters. This post is based on a TED (Technology, Entertainment, and Design) talk by Chris Anderson, “TED’s secret to great public speaking”. Chris heads TED, a non-profit organization that organizes talks on ideas worth sharing. Chris mentions that there is no single formula for giving a better presentation. It is not about telling catchy stories on a red carpet; rather, a presentation is a means of transferring an idea from a presenter to an audience. He describes this idea transfer as a synchronization problem, i.e., if the speaker’s and listeners’ minds get synchronized with one another, the idea gets transferred quickly without losing the audience. To make sure that this synchronization happens, Chris mentions that a speaker should prepare the talk around the following ingredients:

  1. Develop the presentation around one idea: Make sure that the presentation revolves around one idea. All slides should support that idea.
  2. Develop curiosity in the audience: Motivate the presentation, i.e., explain why it is important and why the audience should pay attention. Also, ask questions along the way; this keeps the speaker and the audience connected.
  3. Develop each piece of the presentation with the audience’s awareness in mind: This means that the presentation should be at the listeners’ level of understanding. Wait! How is that possible while delivering technical material to a non-domain audience? The only way to handle this issue is to use “metaphors” as often as possible. Finding metaphors at the audience’s level requires a little creativity and time, but the skill is worth the time spent acquiring it.
  4. Ensure that the presentation adds value for the audience: If you find that the presentation will help the audience, then you should present; otherwise, it is better not to present.

Sustainable Energy

The Oxford dictionary defines sustainable as “able to be maintained at a certain rate or level.” According to the United Nations, sustainability is defined as “meeting the needs of the present without compromising the ability of future generations to meet their own needs.” The important question to ask is why we are discussing sustainable energy, and the answer is that either we have limited sources of energy or we are polluting our environment at an incredibly fast rate. In fact, we face both of these challenges, but at the consumer level we do not notice them.

The major sources of energy are coal, oil, and natural gas. These resources have two major problems: 1) They are limited and are being depleted, which means that our future generations will face energy scarcity. 2) Coal, the main energy source, emits a lot of carbon dioxide, a greenhouse gas that contributes to the greenhouse effect. The greenhouse effect warms our climate, which eventually melts glaciers, raises sea levels, and badly affects the global ecosystem.

Sustainable energy aims to solve our existing energy problem by proposing renewable energy sources, which regenerate naturally and produce clean energy. These sources include solar, wind, water, biogas, and geothermal energy. All these sources are practically inexhaustible. Sustainable energy also includes the practices of energy efficiency and conservation.

Stephen Pacala, an environmental biologist at Princeton University, mentions that we can handle the increasing carbon dioxide challenge with the following four options:

  1. Efficiency: Develop technologies or appliances which are energy efficient
  2. Tripling our nuclear power plants
  3. Cleaning coal plants by burying carbon emissions
  4. Harnessing the sun’s energy using solar panels etc.

Eigen vectors and Eigen values

A point x in a two-dimensional space represents a vector because it has a magnitude and a direction with respect to the origin (0, 0). Multiplying x by a scalar gives another vector that lies on the same line as x, either elongated or scaled down. When we multiply vector x by a matrix A, the result is again a vector, but it may lie either along the same line as x or in a new direction, and it may be scaled up or down. If the resultant vector lies along the same line as x, we say that x is an Eigen vector of matrix A; otherwise, it is not an Eigen vector. A 96-second YouTube video explains the same concept visually.

Corresponding to each Eigen vector, we also get a scalar value (\lambda) which, when multiplied by vector x, gives the same vector as the matrix multiplication above. Mathematically,

A*x = \lambda*x

Here, x refers to the Eigen vector and \lambda refers to the Eigen value.
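The relation can be checked numerically with base R’s eigen() function. The sketch below is only an illustration: the 2 x 2 matrix A and the test vector v are arbitrary choices of mine and are not tied to any dataset.

```r
# Example matrix chosen for illustration (an assumption, not from any dataset)
A <- matrix(c(2, 1,
              1, 2), nrow = 2, byrow = TRUE)

e <- eigen(A)            # base R eigen decomposition
x <- e$vectors[, 1]      # first Eigen vector
lambda <- e$values[1]    # its Eigen value

# A %*% x and lambda * x should be the same vector
Ax <- as.vector(A %*% x)
print(Ax)
print(lambda * x)
isTRUE(all.equal(Ax, lambda * x))

# A vector that is not an Eigen vector leaves its original line
v <- c(1, 0)
as.vector(A %*% v)       # (2, 1) does not lie on the line through (1, 0)
```

Note that eigen() returns Eigen vectors scaled to unit length, and any scalar multiple of x satisfies the same relation.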




Illustration of k value effect on outlier score

Continuing from the previous post, here I will illustrate how outlier scores vary for different k values. The context of the figure below is already explained in my previous post.

[Figure: scatter plot of the 30 days, with clusters A to E marked]

After running the LOF algorithm with the following lines of R code

library(Rlof)        # local outlier factor (LOF)
library(HighDimOut)  # Func.trans() for normalization of LOF scores
# Synthetic data for 30 days: one isolated point followed by five groups of points
df <- data.frame(x = c(5, rnorm(2, 20, 1), rnorm(3, 30, 1), rnorm(5, 40, 1), rnorm(9, 10, 1), rnorm(10, 37, 1)))
df$y <- c(38, rnorm(2, 30, 1), rnorm(3, 10, 1), rnorm(5, 40, 1), rnorm(9, 20, 1), rnorm(10, 25, 1))
#pdf("understandK.pdf", width = 6, height = 6)
# Scatter plot with each point labelled by its day number
plot(df$x, df$y, type = "p", ylim = c(min(df$y), max(df$y) + 5), xlab = "x", ylab = "y")
text(df$x, df$y, pos = 3, labels = 1:nrow(df), cex = 0.7)
# LOF scores for k = 2, ..., 10 (one column per k)
lofResults <- lof(df, 2:10, cores = 2)
# Transform each column of LOF scores with Func.trans (method "FBOD")
apply(lofResults, 2, function(x) Func.trans(x, method = "FBOD"))

we get the outlier scores of the 30 days for each k in the range 2:10, as follows:

[Figure: table of outlier scores of the 30 days for k = 2 to 10]

Before explaining the results further, I present the distance matrix below, where each entry shows the distance between days X and Y. Here, X is the row entry and Y is the column entry.

[Figure: distance matrix between the 30 days]
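A distance matrix of this kind, and the sorted neighbour list of any day, can be obtained with base R. The sketch below assumes Euclidean distance, regenerates the data with an arbitrary seed (the seed is my assumption, so the numbers will differ from the ones discussed in this post), and uses a helper name, neighboursOf, that is purely illustrative.

```r
set.seed(42)  # assumed seed, only so that the sketch is repeatable
df <- data.frame(x = c(5, rnorm(2, 20, 1), rnorm(3, 30, 1), rnorm(5, 40, 1),
                       rnorm(9, 10, 1), rnorm(10, 37, 1)))
df$y <- c(38, rnorm(2, 30, 1), rnorm(3, 10, 1), rnorm(5, 40, 1),
          rnorm(9, 20, 1), rnorm(10, 25, 1))

# Euclidean distance between every pair of days (30 x 30 matrix)
distMat <- as.matrix(dist(df))
round(distMat[1:5, 1:5], 2)

# Neighbours of a given day, sorted by increasing distance
neighboursOf <- function(distMat, day) {
  ord <- order(distMat[day, ])
  ord <- ord[ord != day]            # drop the day itself
  rbind(neighbour = ord, distance = round(distMat[day, ord], 2))
}
neighboursOf(distMat, 1)
```

Calling neighboursOf(distMat, 18) or neighboursOf(distMat, 17) gives the corresponding lists for other days.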

Let us understand how outlier scores get assigned to day 1 for different values of k in the range 2:10. The neighbours of point 1, in order of increasing distance, are:

[Figure: neighbours of point 1 in increasing distance, with their distances]

Here the first row lists the neighbours and the second row gives the distance between point 1 and the corresponding neighbour. Looking at the outlier scores of point 1, we find that up to k = 8 they are very high (close to 1). The reason is that, up to k = 8, the k neighbours of point 1 have a much higher density than point 1 itself, which gives point 1 a high outlier score. But when we set k = 9, the outlier score of point 1 drops to 0. Let us dig deeper. The 8th and 9th neighbours of point 1 are points 18 and 17 respectively. The neighbours of point 18 in increasing distance are:

[Figure: neighbours of point 18 in increasing distance, with their distances]

and the neighbours of point 17 are:

[Figure: neighbours of point 17 in increasing distance, with their distances]

Observe carefully that the 8th neighbour of point 1 is point 18, and the 8th neighbour of point 18 is point 19. Checking the neighbours of point 18, we find that all 8 of them are nearby (in cluster D). So, up to the 8th neighbour, all the k neighbours of point 1 have a high density: they are much denser than point 1, and hence point 1, with its lower density, gets a high anomaly score. On the other hand, the 9th neighbour of point 1 is point 17, whose own 9th neighbour is point 3. On further checking, we find that all the points in cluster D now find their 9th neighbour in either cluster A or cluster B. This decreases the density of all the considered neighbours of point 1. As a result, point 1 and its 9 neighbours now have densities in a similar range, and hence point 1 gets a low outlier score.
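The density comparison described above can be sketched directly in base R. The function below, lofSketch, is a hypothetical name for a minimal implementation of the standard LOF definition (k-distance, reachability distance, local reachability density); it is not the Rlof implementation, it returns raw LOF values rather than the normalised scores discussed in this post, and it assumes there are no duplicate points. The seed is also my assumption, so the exact numbers will differ from the ones above.

```r
lofSketch <- function(data, k) {
  d <- as.matrix(dist(data))
  diag(d) <- Inf                       # a point is not its own neighbour
  n <- nrow(d)
  # k-distance: distance to the k-th nearest neighbour
  kDist <- apply(d, 1, function(row) sort(row)[k])
  # k-neighbourhood: every point within the k-distance (ties included)
  nbrs <- lapply(seq_len(n), function(i) which(d[i, ] <= kDist[i]))
  # local reachability density: inverse of the mean reachability distance
  lrd <- sapply(seq_len(n), function(i) {
    o <- nbrs[[i]]
    1 / mean(pmax(kDist[o], d[i, o]))
  })
  # LOF: mean density of the neighbours relative to the point's own density
  sapply(seq_len(n), function(i) mean(lrd[nbrs[[i]]]) / lrd[i])
}

set.seed(42)  # assumed seed, only so that the sketch is repeatable
df <- data.frame(x = c(5, rnorm(2, 20, 1), rnorm(3, 30, 1), rnorm(5, 40, 1),
                       rnorm(9, 10, 1), rnorm(10, 37, 1)))
df$y <- c(38, rnorm(2, 30, 1), rnorm(3, 10, 1), rnorm(5, 40, 1),
          rnorm(9, 20, 1), rnorm(10, 25, 1))

# Raw LOF of every point for k = 2, ..., 10 (one column per k)
scores <- sapply(2:10, function(k) lofSketch(df, k))
colnames(scores) <- paste0("k=", 2:10)
round(scores[1, ], 2)   # how the raw LOF of point 1 changes with k
```

A raw LOF close to 1 means a point is about as dense as its neighbours, while a value well above 1 means its neighbours are much denser than the point itself.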

I believe that this small example explains how outlier scores vary with different k’s. Interested readers can use the provided R code to understand this example further.

Intuitive meaning of k range in Local Outlier Factor (LOF)

The Local Outlier Factor (LOF) is a well-known outlier detection algorithm. In the previous post, I noted down the steps of LOF, and here I will discuss its k parameter. The k parameter often gives users of LOF difficulty, but once we look at the meaning of k and the application domain at hand, I find it easy to select a k range. The authors of LOF suggest using a range of k values instead of a single value. This is because a particular value of k cannot be generalised over datasets that follow diverse underlying data distributions. Now, let us understand how to select the lower (lwrval) and upper (uprval) values of the k range.

To explain further, let us consider the simple scenario shown in the figure below

[Figure: scatter plot of the 30 days, with clusters A to E marked in red]

This figure shows the energy consumption of an imaginary home for one month (30 days). Each small circle represents the energy consumption of a particular day, and the number above the circle shows the corresponding day of the month. Nearby circles marked within the red clusters (A, B, C, D, E) represent days that follow a similar energy consumption pattern compared to the remaining days.

To use LOF on such a dataset, we need to set a range of k values instead of a single k value. Note that lwrval and uprval are domain dependent. According to the LOF paper, lwrval and uprval are defined as:

  • lwrval: This refers to the minimal size of a cluster of similarly behaving points whose similarity we believe is not due to some random cause. In other words, we assume that a cluster smaller than lwrval represents outliers. For example, if I consider lwrval = 3, then clusters A and B represent outliers because none of the points within these clusters has three similar points/neighbours. At the same time, points within clusters C, D, and E represent normal points because each of them has at least three similar neighbours.
  • uprval: This refers to the upper limit on the number of points that we expect to be similar. In other words, we believe that uprval points must be similar in the considered application domain. For example, in the energy domain, I know that energy consumption is similar on at least 6 days (the working days of a week) due to occupancy behaviour. So, I set uprval = 6. No doubt there can be a cluster larger than uprval, but our reasoning about a specific dataset motivates a suitable uprval. Consider another example where we assume that the occupants of a home change on a weekly basis, say 5, 10, 15, and 20 occupants in the first, second, third, and fourth week of a month respectively. Consequently, energy consumption should be similar within each week and different across weeks. This example suggests that we should get four clusters corresponding to the four weeks, and the size of each cluster should be 7 (the number of days in a week). So, our uprval is 7 in this example.
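As a sketch of how these two limits might be passed to LOF in R, the snippet below builds the k range from lwrval = 3 and uprval = 6 (the values from the energy example above) and hands it to lof() from the Rlof package. The regenerated data, the seed, and the final summary step (taking the largest score of each day over the range) are my assumptions for illustration; the block skips the LOF call if Rlof is not installed.

```r
# Limits taken from the energy example above
lwrval <- 3   # smallest group of similar days treated as a genuine cluster
uprval <- 6   # number of days expected to show similar consumption
kRange <- lwrval:uprval

set.seed(42)  # assumed seed, only so that the sketch is repeatable
df <- data.frame(x = c(5, rnorm(2, 20, 1), rnorm(3, 30, 1), rnorm(5, 40, 1),
                       rnorm(9, 10, 1), rnorm(10, 37, 1)))
df$y <- c(38, rnorm(2, 30, 1), rnorm(3, 10, 1), rnorm(5, 40, 1),
          rnorm(9, 20, 1), rnorm(10, 25, 1))

if (requireNamespace("Rlof", quietly = TRUE)) {
  # One column of LOF scores per k in the range
  lofScores <- Rlof::lof(df, kRange, cores = 2)
  # One possible summary (an assumption): the largest score of each day
  maxLof <- apply(lofScores, 1, max)
  # Days ranked from most to least outlying under this summary
  head(order(maxLof, decreasing = TRUE))
}
```

Taking the maximum over the range is only one way to combine the columns; the per-k scores can also be inspected side by side.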

I believe the lwrval and uprval limits can now be easily interpreted for any application domain. Therefore, following the original LOF paper, we can calculate LOF outlier values over the set of k values defined by lwrval and uprval. In the next post, I will explain the above figure further and show how a particular k value affects the outlier score.