Building a Binary Classification Model with Keras in Python

  • Dec 13, 2024

Updated: May 28

In this tutorial, we’ll build a simple binary classification model using Keras. This type of model is useful when you're trying to predict one of two classes (e.g., yes/no, true/false, 0/1). We'll go through the steps of defining the model architecture, compiling it, training it, and evaluating its predictions.


Introduction to Keras and TensorFlow

Keras is a high-level deep learning API that runs on top of TensorFlow, designed to simplify the process of building and training deep learning models. It provides an intuitive interface for defining neural networks and streamlines tasks such as model creation, training, and evaluation. TensorFlow, on the other hand, is a powerful and flexible open-source machine learning framework developed by Google, which serves as the backbone for Keras. TensorFlow offers comprehensive support for various machine learning and deep learning tasks, including natural language processing, computer vision, and reinforcement learning. By combining the ease of Keras with the robustness of TensorFlow, developers can quickly build complex models while leveraging the computational power and scalability that TensorFlow provides. Whether you're a beginner or an expert, the Keras API makes it easier to implement machine learning algorithms without getting lost in the complexities of the underlying TensorFlow framework.


Implementing Binary Classification Model with Keras in Python

To implement a binary classification model with Keras, you start by defining the architecture. This tutorial uses the Keras Functional API, in which layers are chained from a keras.Input object and then wrapped in a keras.Model. The model consists of an input layer that accepts the feature vector, followed by one or more hidden layers with ReLU activation functions to introduce non-linearity. Finally, the output layer uses a sigmoid activation function, which outputs a probability value between 0 and 1, representing the likelihood of the input belonging to the positive class. After defining the model architecture, you compile it with an appropriate optimizer, like RMSprop, and a loss function, such as binary cross-entropy, which is the standard choice for binary classification tasks.


Step 1: Importing Required Libraries

Before we begin building the model, we need to import Keras and its layers module from TensorFlow. Keras is the high-level API that simplifies neural network construction.

from tensorflow import keras 
from tensorflow.keras import layers

Step 2: Defining the Model Architecture

After preparing the dataset, the next step is to define the neural network architecture. Since this is a binary classification problem, the model is designed to predict one of two possible outcomes, such as yes/no, true/false, or positive/negative.


The architecture consists of three main parts: an input layer, two hidden layers, and an output layer. The input layer receives the feature values from the dataset, the hidden layers learn patterns and relationships within the data, and the output layer generates the final prediction.

inputs = keras.Input(shape=(num_input_features,)) 
x = layers.Dense(32, activation="relu")(inputs) 
x = layers.Dense(32, activation="relu")(x) 
outputs = layers.Dense(1, activation="sigmoid")(x)

In this model, num_input_features represents the number of input variables available in the dataset. For example, if a dataset contains attributes such as age, income, and spending score, then the input layer will expect three features for each sample.


The first hidden layer contains 32 neurons and uses the ReLU (Rectified Linear Unit) activation function. ReLU helps the network learn complex and non-linear relationships by converting negative values to zero while keeping positive values unchanged. The output of this layer is then passed to a second hidden layer, which also contains 32 neurons and uses the same activation function.
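
The ReLU rule described above can be illustrated with a minimal pure-Python sketch. The relu helper and the sample values are hypothetical and exist only for illustration; inside the layer, Keras applies the same rule element-wise to each neuron's output.

```python
def relu(value):
    """Return the value if it is positive, otherwise 0.0."""
    return max(0.0, value)

# Hypothetical pre-activation values produced by a few neurons
pre_activations = [-2.0, -0.5, 0.0, 1.5, 3.0]
activated = [relu(v) for v in pre_activations]

print(activated)  # negative inputs become 0.0, positive inputs pass through
```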


Finally, the output layer consists of a single neuron with a sigmoid activation function. The sigmoid function converts the network’s output into a value between 0 and 1, which can be interpreted as the probability of belonging to the positive class. For example, an output of 0.85 indicates an 85% probability that the input belongs to class 1, while a value closer to 0 indicates a higher likelihood of belonging to class 0.
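
To make the sigmoid mapping concrete, here is a small sketch using only the standard library. The sigmoid helper and the input values are assumptions chosen for illustration, not part of the model code.

```python
import math

def sigmoid(z):
    """Map a real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical raw outputs of the final neuron before the activation
for z in [-4.0, 0.0, 4.0]:
    print(z, round(sigmoid(z), 3))
```

A raw output of 0 maps to exactly 0.5, large negative values approach 0, and large positive values approach 1.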


This simple yet effective architecture serves as a strong foundation for many binary classification tasks and can be further customized by adjusting the number of layers, neurons, or activation functions based on the complexity of the problem.
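
To get a feel for the size of this architecture, the sketch below counts its trainable parameters by hand. It assumes three input features (the age, income, and spending score example), and the dense_params helper is a hypothetical name introduced here. Each Dense layer has one weight per input-unit pair plus one bias per unit.

```python
def dense_params(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units

num_input_features = 3  # assumption: age, income, spending score

layer_sizes = [32, 32, 1]  # the two hidden layers and the output layer
total = 0
previous = num_input_features
for units in layer_sizes:
    count = dense_params(previous, units)
    print(f"Dense({units}): {count} parameters")
    total += count
    previous = units

print("Total trainable parameters:", total)
```

Under these assumptions the three layers contribute 128, 1056, and 33 parameters, for 1217 in total.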


Step 3: Compiling the Model

After defining the architecture, the next step is to compile the model. In Keras, compiling involves configuring the model for training. Specifically, you need to specify the optimizer and the loss function. For binary classification tasks, the commonly used loss function is binary cross-entropy, and for optimization, we are using RMSprop, which is an adaptive learning rate optimizer.

model = keras.Model(inputs, outputs)
model.compile(optimizer="rmsprop", loss="binary_crossentropy")

keras.Model(inputs, outputs) creates a complete Keras model by connecting the previously defined input and output layers. This model object serves as the central component that will be trained and used for making predictions.


The optimizer="rmsprop" argument specifies the optimization algorithm used during training. RMSprop is an adaptive learning rate optimizer: it scales the update for each parameter by a moving average of that parameter's recent squared gradients. This often helps neural networks converge faster than gradient descent with a single fixed learning rate. Passing the name as a string uses the optimizer's default settings.
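
The idea behind RMSprop can be sketched for a single parameter as follows. This is the textbook form of the update, not a copy of the Keras implementation. The rmsprop_step helper, the toy objective, and the hyperparameter values are all assumptions chosen for illustration.

```python
import math

def rmsprop_step(weight, grad, avg_sq_grad,
                 learning_rate=0.01, rho=0.9, epsilon=1e-7):
    """One RMSprop update for a single parameter (textbook form)."""
    # Exponential moving average of the squared gradients
    avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad ** 2
    # Divide the step by the root of that running average
    weight = weight - learning_rate * grad / (math.sqrt(avg_sq_grad) + epsilon)
    return weight, avg_sq_grad

# Toy objective (hypothetical): f(w) = w**2, whose gradient is 2 * w
weight, avg_sq_grad = 5.0, 0.0
for step in range(200):
    grad = 2.0 * weight
    weight, avg_sq_grad = rmsprop_step(weight, grad, avg_sq_grad)

print(round(weight, 3))
```

Because each step is divided by the running gradient magnitude, the weight moves by roughly learning_rate per update in this example, regardless of how large the raw gradient is.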


The loss="binary_crossentropy" argument defines the loss function that the model will use to evaluate prediction errors. Binary cross-entropy is specifically designed for binary classification tasks, where the target variable has two possible outcomes, typically represented as 0 and 1. The loss function penalizes incorrect predictions and provides feedback that guides the optimizer in updating the model's weights.
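
As a rough illustration of how this loss penalizes errors, the sketch below computes the mean binary cross-entropy by hand. The binary_crossentropy helper, the clipping constant, and the sample labels and probabilities are assumptions made for this example.

```python
import math

def binary_crossentropy(y_true, y_pred, epsilon=1e-7):
    """Mean binary cross-entropy over paired labels and probabilities."""
    total = 0.0
    for label, prob in zip(y_true, y_pred):
        # Clip to avoid log(0) when a prediction is exactly 0 or 1
        prob = min(max(prob, epsilon), 1.0 - epsilon)
        total += -(label * math.log(prob)
                   + (1 - label) * math.log(1.0 - prob))
    return total / len(y_true)

labels = [1, 0, 1, 0]
mostly_right = [0.9, 0.1, 0.8, 0.2]  # hypothetical good predictions
mostly_wrong = [0.1, 0.9, 0.2, 0.8]  # hypothetical poor predictions

print(round(binary_crossentropy(labels, mostly_right), 4))
print(round(binary_crossentropy(labels, mostly_wrong), 4))
```

The second set of predictions is confidently wrong, so its loss is more than ten times larger than the first.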


Step 4: Training the Model

Once the model has been compiled, the next step is to train it using the training dataset. During training, the model learns the relationship between the input features and the target labels by adjusting its internal weights to minimize the loss function.

In Keras, training is performed using the fit() method:

history = model.fit(
    x_train,
    y_train,
    epochs=20,
    batch_size=32,
    validation_split=0.2)

In this example, x_train contains the input features and y_train contains the corresponding binary labels (0 or 1). The epochs parameter specifies how many times the model will iterate through the entire training dataset, while batch_size determines how many samples are processed before the model updates its weights.
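
To see how epochs and batch_size interact, the arithmetic below estimates the number of weight updates for the fit() call above. The dataset size of 1000 samples is an assumption made for illustration.

```python
import math

num_samples = 1000       # assumption: number of rows in x_train
validation_split = 0.2
batch_size = 32
epochs = 20

# Samples left for training after the validation portion is held out
num_train = int(num_samples * (1 - validation_split))
# One weight update per batch; the last batch of an epoch may be smaller
steps_per_epoch = math.ceil(num_train / batch_size)
total_updates = steps_per_epoch * epochs

print(num_train, steps_per_epoch, total_updates)
```

Under these assumptions, 800 samples are used for training, giving 25 updates per epoch and 500 updates over the full run.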


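The validation_split parameter reserves a portion of the training data for validation; here, 0.2 holds out 20% of the samples. Keras takes this portion from the end of the arrays passed to fit(), before any shuffling. This allows you to monitor how well the model performs on unseen data during training and helps identify issues such as overfitting.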
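
The effect of validation_split can be pictured as a simple slice. The sketch below mimics the behavior on a plain list; split_for_validation is a hypothetical helper, not a Keras function.

```python
def split_for_validation(samples, validation_split):
    """Hold out the final fraction of the samples, without shuffling."""
    split_at = int(len(samples) * (1 - validation_split))
    return samples[:split_at], samples[split_at:]

samples = list(range(10))  # hypothetical sample indices 0..9
train_part, val_part = split_for_validation(samples, 0.2)

print(train_part)
print(val_part)
```

Since the held-out samples come from the end of the data, a dataset that is sorted (for example by label) should be shuffled before it is passed to fit().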


As training progresses, Keras displays the training loss and validation loss for each epoch. Ideally, both values should decrease over time, indicating that the model is learning meaningful patterns from the data. If the validation loss starts to rise while the training loss keeps falling, the model is likely overfitting.
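
The object returned by fit() stores these per-epoch values in history.history, a dictionary with keys such as "loss" and "val_loss". The sketch below shows one way to inspect them. The numbers are made up for illustration and stand in for history.history.

```python
# Hypothetical per-epoch values, shaped like history.history
recorded = {
    "loss":     [0.69, 0.55, 0.44, 0.36, 0.30, 0.25],
    "val_loss": [0.68, 0.57, 0.49, 0.47, 0.50, 0.56],
}

val_loss = recorded["val_loss"]
best_epoch = val_loss.index(min(val_loss)) + 1  # report epochs as 1-based

print("Lowest validation loss at epoch", best_epoch)
if best_epoch < len(val_loss):
    print("Validation loss rose afterwards: a sign of possible overfitting")
```

In this made-up run the training loss keeps falling while the validation loss bottoms out at epoch 4, which suggests that training beyond that point mostly memorizes the training set.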


Step 5: Making Predictions and Evaluating the Model

After training, the model can be used to make predictions on new or unseen data. Since the output layer uses a sigmoid activation function, the model produces probability values between 0 and 1.

predictions = model.predict(x_test)

Each prediction represents the probability that a sample belongs to the positive class. For example, a prediction of 0.92 indicates a 92% probability that the sample belongs to class 1, while a prediction of 0.08 suggests it likely belongs to class 0.


To convert probabilities into class labels, a threshold is typically applied. The most common threshold is 0.5:

predicted_classes = (predictions > 0.5).astype("int32")

Any value greater than 0.5 is classified as class 1, while values of 0.5 or below are classified as class 0.
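
The threshold does not have to be 0.5. As a small sketch, the hypothetical to_class_labels helper below applies the same strict comparison to a plain list of made-up probabilities with two different thresholds.

```python
def to_class_labels(probabilities, threshold=0.5):
    """Label a sample 1 only when its probability exceeds the threshold."""
    return [1 if p > threshold else 0 for p in probabilities]

probabilities = [0.08, 0.35, 0.50, 0.62, 0.92]  # hypothetical model outputs

print(to_class_labels(probabilities))                 # default threshold 0.5
print(to_class_labels(probabilities, threshold=0.3))  # lower threshold
```

Lowering the threshold labels more samples as class 1, which tends to catch more true positives at the cost of more false positives.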


To measure the effectiveness of the model, it is important to evaluate its performance on a separate test dataset.

test_loss = model.evaluate(x_test, y_test)
print("Test Loss:", test_loss)

Because the model was compiled without a metrics argument, evaluate() returns only the loss. For binary classification problems, common evaluation metrics include:


  1. Accuracy: Measures the percentage of correctly classified samples. While accuracy is easy to understand, it may be misleading when working with highly imbalanced datasets.

  2. Precision: Measures how many of the samples predicted as positive are actually positive. Precision is particularly important in situations where false positives are costly.

  3. Recall: Measures how many actual positive samples are correctly identified by the model. Recall is useful when missing positive cases is undesirable.

  4. F1-Score: Provides a balance between precision and recall by combining both metrics into a single score.

  5. ROC-AUC: Evaluates the model's ability to distinguish between the two classes across different classification thresholds. A higher AUC score generally indicates better classification performance.
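
The metrics listed above can be computed by hand on a small example. This is a minimal sketch with made-up labels and scores. The helper names are hypothetical, and the code assumes both classes are present and at least one sample is predicted positive. In practice, libraries such as scikit-learn provide tested implementations.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def roc_auc(y_true, scores):
    """Chance that a random positive scores above a random negative."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical true labels and predicted probabilities
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.7, 0.3, 0.6, 0.1, 0.8, 0.2, 0.55, 0.05]
y_pred = [1 if s > 0.5 else 0 for s in scores]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
auc = roc_auc(y_true, scores)

print(accuracy, precision, recall, round(f1, 3), auc)
```

On this data the classifier reaches an accuracy of 0.7, a precision of 0.6, and a recall of 0.75, which shows how the metrics can tell different stories about the same predictions.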


By analyzing these metrics, you can determine how well the model generalizes to unseen data and identify opportunities for improvement. Once satisfied with the results, the trained model can be deployed to make predictions in real-world binary classification applications such as spam detection, customer churn prediction, fraud detection, and disease diagnosis.


Conclusion

Binary classification serves as the foundation for many real-world machine learning applications, enabling systems to make decisions between two possible outcomes with remarkable efficiency. While building a model in Keras requires only a few lines of code, creating a reliable classifier involves much more than defining layers and training on data. The quality of the dataset, feature selection, model architecture, and evaluation strategy all play a crucial role in determining how well the model performs in practical scenarios.

As you develop binary classification models, it is important to focus not only on achieving high accuracy but also on understanding the model's behavior through appropriate evaluation metrics. A model that performs well on unseen data and consistently makes reliable predictions is far more valuable than one that simply memorizes patterns from the training set.

Keras provides an accessible and flexible framework for experimenting with neural networks, making it an excellent choice for beginners and experienced practitioners alike. With a solid understanding of the fundamentals covered in this tutorial, you are well-positioned to tackle more advanced classification problems, explore deeper architectures, and build intelligent systems capable of supporting data-driven decision-making across a wide range of domains.
